Hi,
I know I can do this in a data pipeline copy activity but I'd like to be able to download a binary file from SFTP within a notebook.
I can download a CSV using paramiko by decoding it to a string and then writing it to OneLake, but when I try to write anything other than string data it fails, saying it can't find the path. I'd like to be able to do a direct binary copy.
Has anyone got a code snippet that would work?
Thanks
Ben
Thank you! That link gave me the clue I needed. I downloaded the data from SFTP using:

    import os
    import paramiko

    ## Main SFTP function
    def download_files_from_sftp(hostname, port, username, password, local_directory, remote_directory):
        ## Initialize the SSH client
        ssh_client = paramiko.SSHClient()
        ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        try:
            ## Connect to the server
            ssh_client.connect(hostname, port=port, username=username, password=password)
            ## Initialize the SFTP client
            with ssh_client.open_sftp() as sftp:
                ## List files in the specified directory
                file_list = sftp.listdir(remote_directory)
                for file in file_list:
                    remote_filepath = f"{remote_directory}/{file}"
                    local_filepath = os.path.join(local_directory, file)
                    ## Check that it is a regular file and not a directory
                    if sftp.stat(remote_filepath).st_mode & 0o170000 == 0o100000:  # 0o100000 is the mask for a regular file
                        ## Download the file
                        ## Reference the lakehouse as local storage, e.g. "/lakehouse/default/Files/FolderName/filename.csv"
                        sftp.get(remote_filepath, local_filepath)
                        print(f"Downloaded {file}")
        except Exception as e:
            print(f"Error: {e}")
        finally:
            ## Close the SSH client
            ssh_client.close()
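As an aside, the octal mask test in the snippet is equivalent to Python's `stat.S_ISREG` helper, which reads a little more clearly. A minimal sketch (the mode values below are illustrative, not taken from a real server):

```python
import stat

# 0o170000 is S_IFMT (the file-type bit mask) and 0o100000 is S_IFREG,
# so masking st_mode and comparing isolates "is this a regular file?".
regular = 0o100644    # st_mode of a typical regular file (rw-r--r--)
directory = 0o040755  # st_mode of a typical directory

assert (regular & 0o170000) == 0o100000
assert stat.S_ISREG(regular)

assert (directory & 0o170000) != 0o100000
assert stat.S_ISDIR(directory)
print("ok")
```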
The missing part was correctly referencing the attached default lakehouse as if it were local storage.
By using a path in the format "/lakehouse/default/Files/FolderName/filename.csv", the problems I was having went away!
Obviously, if you're using the code above, make sure you store the passwords etc. as secrets in Azure Key Vault and access them with:

    sftp_password = mssparkutils.credentials.getSecret("https://keyvaultname.vault.azure.net/", "Secret-Name")
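Putting it together, a hedged usage sketch (the hostname, folder names, and secret placeholder below are hypothetical, not values from this thread — in a Fabric notebook the password would come from `mssparkutils.credentials.getSecret` as shown above):

```python
import os

# Placeholder password -- in Fabric, fetch this from Key Vault instead of hard-coding it.
sftp_password = "<from-key-vault>"

# The attached default lakehouse is mounted at /lakehouse/default, so its
# Files area can be passed to sftp.get() like an ordinary local directory.
local_directory = "/lakehouse/default/Files/SftpDownloads"

# Hypothetical call (commented out so the sketch runs without a live server):
# download_files_from_sftp("sftp.example.com", 22, "user", sftp_password,
#                          local_directory, "/outbound")

# A downloaded file would land at a path like:
local_filepath = os.path.join(local_directory, "report.bin")
print(local_filepath)  # /lakehouse/default/Files/SftpDownloads/report.bin
```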
Hi @bcdobbs ,
Glad to know that your query got resolved. Please continue using Fabric Community on your further queries.
Hi @bcdobbs ,
Thanks for using Fabric Community.
Can you please check this similar thread: Solved: Re: Copy a file from an URL to OneLake - Microsoft Fabric Community.
I think you might get some ideas after looking into that thread.
Please let me know if you have further queries.
Hello @bcdobbs ,
We haven't heard from you since the last response and were just checking back to see if you have a resolution yet.
If you have found a resolution, please share it with the community, as it can be helpful to others.
Otherwise, we will respond with more details and try to help.