Quique
Frequent Visitor

Run Script for all Lakehouse Files / Lakehouse files storage account key

I would like to use a notebook to run the same script against every file stored in the Lakehouse Files area. How can this be done? If I need the storage account key for the Lakehouse files, where can I get it?

1 ACCEPTED SOLUTION
alxdean
Advocate V

You can easily list all files in a Lakehouse folder from your notebook. Just make sure to attach the lakehouse in the left pane first; it then becomes the default lakehouse for the notebook.
 
import os

# Path of the default lakehouse's Files area as mounted in the notebook
file_path = "/lakehouse/default/Files/yoursubfoldername_here"
lst = os.listdir(file_path)
 
Then you can iterate through the files and apply the same logic to each one in a loop.
 
for file in lst:
    full_path = os.path.join(file_path, file)
    print(full_path)  # replace with the logic to run on each file
 
So the trick is not to run the same notebook once per file, but to list the files inside the notebook and then run the same logic on each of them.
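As a rough sketch of that pattern, the loop can be wrapped in a small function so that the per-file logic lives in one place. The names `process_file` and `run_for_all_files` are made up for illustration and are not part of any Fabric API; the per-file step here just returns the file size and would be swapped for your own script.

```python
import os


def process_file(path):
    """Hypothetical per-file step: return the file size in bytes."""
    return os.path.getsize(path)


def run_for_all_files(folder):
    """Apply process_file to every regular file directly inside folder."""
    results = {}
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        # os.listdir also returns subfolders, so skip anything that is not a file
        if os.path.isfile(full_path):
            results[name] = process_file(full_path)
    return results
```

In a notebook with a default lakehouse attached, this would be called as `run_for_all_files("/lakehouse/default/Files/yoursubfoldername_here")`.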
 
You could instead create a pipeline that lists the files in a folder, then calls a notebook and passes each file path as a parameter, but that just adds unnecessary complexity.
 
If you're looking for the fully qualified path to your files in the lakehouse, you can copy it from the properties of the Files folder.

REPLIES
Quique
Frequent Visitor

Thanks! I'm sure this works, and it almost worked for me, but for some reason I keep getting this error: Spark_Ambiguous_MsSparkUtils_UseMountedPathFailure.

I'll keep checking.

 
BoSe
Frequent Visitor

For me it only works when I use the /lakehouse/default/Files... path.
However, when I try it with the abfss path, I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'abfss://.../input'

 

file_path = f"abfss://.../input"
lst = os.listdir(file_path)
lst

 

Any idea what causes that issue?
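One likely explanation, offered as a sketch rather than a confirmed diagnosis: `os.listdir` only knows about the local (mounted) file system, so it treats an `abfss://` URI as a literal local path that does not exist. The helpers `is_local_path` and `list_local` below are hypothetical names used to illustrate a guard; they check for a URI scheme before calling `os.listdir`.

```python
import os
from typing import List
from urllib.parse import urlparse


def is_local_path(path: str) -> bool:
    """Return True when the string carries no URI scheme such as abfss://."""
    return urlparse(path).scheme == ""


def list_local(path: str) -> List[str]:
    """List a local or mounted folder, rejecting URI-style paths up front."""
    if not is_local_path(path):
        # os.listdir would raise FileNotFoundError here, because it looks
        # for the URI as if it were a folder name on the local file system.
        raise ValueError(f"{path!r} is a URI, not a local or mounted path")
    return sorted(os.listdir(path))
```

With this guard, `list_local("/lakehouse/default/Files/yoursubfoldername_here")` behaves like `os.listdir`, while `list_local("abfss://.../input")` fails with a clearer message. Reading from an abfss location would need Spark or the notebook's own file-system utilities instead of the `os` module.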

UPDATE: It works now, thanks again! There were a couple of problems: the Delta tables I was creating had blank spaces in their names. Also, when creating the Delta tables, I switched to the fully qualified path instead of the relative path.
