Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
KimS1231
Frequent Visitor

Need help with reading files in a folder

I'm trying to pull specific data from cells in excel workbooks. However, in the folder is a system file called thumbs.db. I cannot delete this as it says it is in use. No matter how I try to filter the files (I tried telling the filter to ignore thumbs.db, only filter for .xlsx, etc.) I want the query to read, it hits the thumbs.db file and just stops. It gets to the step for adding a custom column where it is reading the sheets in the excel files and since thumbs.db doesn't fit that scenario, it stops working. Is there a way to work around this?

 

Here are the steps in the query:

Source = Folder.Files("P:\XXX\XXXXXXX"),
#"Filter for xlsx files only" = Table.SelectRows(#"Filter out .tmp & .db files", each [Extension] = ".xlsx"),
#"Added Custom Column" = Table.AddColumn(#"Filter for xlsx files only", "Sheets", each Excel.Workbook(File.Contents([Folder Path]&[Name]))),
#"Expanded Sheets" = Table.ExpandTableColumn(#"Added Custom Column", "Sheets", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Sheets.Name", "Sheets.Data", "Sheets.Item", "Sheets.Kind", "Sheets.Hidden"}),
#"Expanded Sheets.Data" = Table.ExpandTableColumn(#"Expanded Sheets", "Sheets.Data", {"Column2", "Column3"}, {"Sheets.Data.Column2", "Sheets.Data.Column3"}),
#"Filter for Budget Summary Sheet" = Table.SelectRows(#"Expanded Sheets.Data", each [Sheets.Item] = "Budget Summary"),
#"Filter for Outside Sales Comm" = Table.SelectRows(#"Filter for Budget Summary Sheet", each Text.StartsWith([Sheets.Data.Column2], "Outside"))

2 ACCEPTED SOLUTIONS
Vijay_A_Verma
Super User
Super User

You should replace this line 

#"Filter for xlsx files only" = Table.SelectRows(#"Filter out .tmp & .db files", each [Extension] = ".xlsx"),

with

#"Filter for xlsx files only" = Table.SelectRows(Source, each [Extension] = ".xlsx"),

View solution in original post

v-yingjl
Community Support
Community Support

Hi @KimS1231 ,

thumbs.db are hidden files created when using thumbnail previews for folders in File Explorer.

 

If you can confirm that the query has filtered the xlsx files in Power Query, you can try to delete the file by the following articles and check it again:

How to delete network folder with thumbs.db file on Windows 10 

 

Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

View solution in original post

4 REPLIES 4
v-yingjl
Community Support
Community Support

Hi @KimS1231 ,

thumbs.db are hidden files created when using thumbnail previews for folders in File Explorer.

 

If you can confirm that the query has filtered the xlsx files in Power Query, you can try to delete the file by the following articles and check it again:

How to delete network folder with thumbs.db file on Windows 10 

 

Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

Vijay_A_Verma
Super User
Super User

In your PQ steps, can you check that at this step - #"Filter for xlsx files only" , you see .xlsx files only and Thumbs.db is getting filtered out?

KimS1231
Frequent Visitor

Hello Vijay

I did the replacement and it is still erroring out. 

 

Here is the new code snip:

Source = Folder.Files("P:\XXXX\XXXXXX"),
#"Filter for xlsx files only" = Table.SelectRows(Source, each [Extension] = ".xlsx"),
#"Added Custom Column" = Table.AddColumn(#"Filter for xlsx files only", "Sheets", each Excel.Workbook(File.Contents([Folder Path]&[Name]))),
#"Expanded Sheets" = Table.ExpandTableColumn(#"Added Custom Column", "Sheets", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Sheets.Name", "Sheets.Data", "Sheets.Item", "Sheets.Kind", "Sheets.Hidden"}),
#"Expanded Sheets.Data" = Table.ExpandTableColumn(#"Expanded Sheets", "Sheets.Data", {"Column2", "Column3"}, {"Sheets.Data.Column2", "Sheets.Data.Column3"}),
#"Filter for Budget Summary Sheet" = Table.SelectRows(#"Expanded Sheets.Data", each [Sheets.Item] = "Budget Summary"),
#"Filter for Outside Sales Comm" = Table.SelectRows(#"Filter for Budget Summary Sheet", each Text.StartsWith([Sheets.Data.Column2], "Outside"))

 

When I close and load, I get this message: [DataFormat.Error] File contains corrupted data

It's referring to thumbs.db as all other files in the folder are excel files of type .xlsx. As soon as it tries to expand the columns, it errors out. 

Vijay_A_Verma
Super User
Super User

You should replace this line 

#"Filter for xlsx files only" = Table.SelectRows(#"Filter out .tmp & .db files", each [Extension] = ".xlsx"),

with

#"Filter for xlsx files only" = Table.SelectRows(Source, each [Extension] = ".xlsx"),

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors