Anonymous
Not applicable

Power Query refresh really slow - Excel file

Hi Everyone,

New to Power BI and Power Query here.

I'm trying to combine 10 Excel files in a folder (each one is about 15 MB). When I apply changes, it takes about 4 hours to refresh the query and uses all of my memory and CPU!!

I've unchecked a few boxes in Settings to help speed up the load and read countless threads about buffering and query folding, but they don't seem to be applicable to my case, or I simply don't know how to use them.

 

Can anyone please take a look at my query and let me know how I can fix it?

 

Many Thanks!!!

 

let
    Source = Folder.Files("M:\Dept\Global Interim Data Set (GIDS)"),
    #"Removed Other Columns" = Table.SelectColumns(Source, {"Content", "Name"}),
    #"Added Custom" = Table.AddColumn(#"Removed Other Columns", "Custom", each Excel.Workbook([Content])),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom", {"Content"}),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns", "Custom", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Custom.Name", "Custom.Data", "Custom.Item", "Custom.Kind", "Custom.Hidden"}),
    #"Filtered Rows1" = Table.SelectRows(#"Expanded Custom", each Text.Contains([Custom.Name], "Client") and not Text.Contains([Custom.Name], "$")),
    #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1", {"Custom.Name", "Custom.Kind", "Custom.Hidden"}),
    #"obtain date" = Table.AddColumn(#"Removed Columns1", "Date", each Text.Select([Name], {"0".."9"})),
    #"Removed Columns2" = Table.RemoveColumns(#"obtain date", {"Name", "Custom.Item"}),
    SkipRows = Table.AddColumn(#"Removed Columns2", "Skip first 4", each Table.PromoteHeaders(Table.Skip([Custom.Data], 4))),
    #"Removed Columns4" = Table.RemoveColumns(SkipRows, {"Custom.Data"}),
    unioncol = List.Union(List.Buffer(List.Transform(#"Removed Columns4"[Skip first 4], each Table.ColumnNames(_)))),
    ExpandCol = Table.Buffer(Table.ExpandTableColumn(#"Removed Columns4", "Skip first 4", unioncol)),
    RemoveEmpty = FnRemoveEmptyColumns(ExpandCol),
    ColName = Table.ColumnNames(RemoveEmpty),
    #"Removed Duplicates" = Table.Buffer(Table.Distinct(RemoveEmpty, List.Range(Table.ColumnNames(RemoveEmpty), 1, 162)))
in
    #"Removed Duplicates"

2 ACCEPTED SOLUTIONS
dax
Community Support

Hi ddm123, 

I found a similar post about this problem; you could refer to Any way to speed up really slow refresh times in Power Query? | Power Query | Excel Forum for details.

Best Regards,
Zoe Zhi

If this post helps, then please consider accepting it as the solution to help other members find it more quickly.


Hello @Anonymous 

 

I understand. But then I can't figure out how those first 4 rows of every sheet are removed after you've combined them.

I would go for:

- Create one query that treats your file the way you need it

- Create a parameter holding the path/name of one sample file

- Replace the file name in your first query with the parameter and convert the query to a function

- Use the folder data source to extract the file names of that folder... and do some filtering if needed

- Add a new column where you invoke your function with path and name as input.

 

That should do the trick
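The steps above can be sketched in M roughly as follows (a minimal sketch only; the function name fnTransformFile is an assumption, and the "Client" sheet filter and Table.Skip logic are taken from the original query):

```
// Query 1: "fnTransformFile" - the sample-file query converted to a function.
// FilePath is the full path to one Excel file.
(FilePath as text) as table =>
let
    Workbook   = Excel.Workbook(File.Contents(FilePath)),
    // same filter as the original query: "Client" sheets only
    ClientOnly = Table.SelectRows(Workbook, each Text.Contains([Name], "Client") and not Text.Contains([Name], "$")),
    Data       = ClientOnly{0}[Data],
    // drop the first 4 rows, then promote headers
    Promoted   = Table.PromoteHeaders(Table.Skip(Data, 4))
in
    Promoted

// Query 2: list the files in the folder and invoke the function per file
let
    Source   = Folder.Files("M:\Dept\Global Interim Data Set (GIDS)"),
    Invoked  = Table.AddColumn(Source, "Data", each fnTransformFile([Folder Path] & [Name])),
    Combined = Table.Combine(Invoked[Data])
in
    Combined
```

This keeps the per-file transformation in one place, so each file is opened and trimmed once, and the main query only has to combine the results.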

 

If this post helps or solves your problem, please mark it as solution.
Kudos are nice to - thanks
Have fun

Jimmy


6 REPLIES

Anonymous
Not applicable

Hi @dax , 

 

I read the post and converted my files to .csv; that decreased the refresh time to ~3 hours (a slight improvement). Is there any other way to speed it up??? Maybe I'm asking for too much....

 

Hello @Anonymous 

 

It's very difficult to get an idea of what exactly is happening here.

- Columns are expanded and then deleted afterwards without being used,

- extracting a plain xlsx file, I don't see how a column reference can be a table at row level for "Table.Skip" in this case - SkipRows = Table.AddColumn(#"Removed Columns2","Skip first 4",each Table.PromoteHeaders(Table.Skip([Custom.Data],4))),

- a custom function is used (not knowing what it does) - maybe use Table.Buffer in there,

- I don't know if the Table.Buffer at the end is very useful.

Maybe a change of concept in retrieving the data would be best (e.g. create a function to query the files and put the data together at the end, or change the logic) - it's difficult to answer when the real goal isn't known.

 

BR

Jimmy

 

Anonymous
Not applicable

Hi @Jimmy801 , 

 

thanks so much for the response. I put in the "Table.Skip" because, for all the files I'm combining, the first 4 rows don't need to be transferred to the query, and the custom function was used to remove all the empty columns from the files. I will try to create a function to connect the data and see how much of a difference it makes. Each file contains over 100 columns, and I'm using Power Query to combine the data because files will be added to the folder on a monthly basis; I then build a data model out of the flat files with 100 columns.

I'm just so frustrated that it takes so long to combine the files every time I refresh the data model. One other thing: it seems like for every step added in PQ, the data is retrieved again from the very first step, which I don't really understand, and I think that's another reason the data loads so slowly. By adding a Table.Buffer step I thought the purpose was to cache the table in memory to accelerate the load. However, I realized that when adding a step after the buffer step, the data was still being pulled from the beginning... Am I misunderstanding how Table.Buffer works?
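For context on the Table.Buffer question: Table.Buffer pins a copy of a table in memory only for the duration of a single evaluation of that query. It does not persist anything between refreshes, so every refresh still re-reads the source files. A minimal sketch of the intended use (folder path taken from the query above; the later steps are illustrative only):

```
let
    Source   = Folder.Files("M:\Dept\Global Interim Data Set (GIDS)"),
    // Buffered is materialized once per evaluation of this query;
    // later steps that reference it reuse the in-memory copy
    // instead of re-evaluating Source
    Buffered = Table.Buffer(Source),
    OnlyXlsx = Table.SelectRows(Buffered, each Text.EndsWith([Name], ".xlsx")),
    Slim     = Table.SelectColumns(OnlyXlsx, {"Content", "Name"})
in
    Slim
```

Note that buffering does not speed up the next refresh, and buffering a large table can itself exhaust memory, which may be part of what happened with the Table.Buffer calls in the original query.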

 

 

 


Anonymous
Not applicable

Hi @Jimmy801 , 

 

Very good suggestion. It decreased the loading time a lot!!!
