
Gjakova
Post Patron

Why is Power BI SO SLOW?!?

I'm currently working with some Excel files; all of them combined are 728 KB, NOT EVEN 1 MB, but Power BI takes forever when I refresh my data or try to merge tables together.

Does someone know why this would be the case? I applied a lot of steps to transform my data the way I like it, but I hope that isn't the reason Power BI is struggling with a 700 KB file?!

 


14 REPLIES
edhans
Super User

We would need to see the transformations. If you do a lot of merges, create structured (nested) columns, etc., you can get Power Query into a knot.

 

If you post your M code we might be able to see what is wrong, but may need your source and PBIX file if you can remove any confidential data.

 

I can get Power Query to go south with just a few thousand rows of data inefficiently transformed.




Unfortunately I can't share my PBIX file due to confidential data, but here are some screenshots of my applied steps on my fact table:

[Three screenshots of the Applied Steps pane for the fact table]

Because I unpivoted my fact table, it went from 20+ columns and 1,000 rows to 7 columns and 100,000+ rows. Since then I haven't done any shocking transformations (I think), but above you can have a look at all the steps I have taken so far...
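For reference, an unpivot like that is a single step in M; a minimal sketch (the key column names and the PreviousStep reference are placeholders, not my actual query):

// Every column except the listed key columns is rotated into
// Attribute/Value pairs, which is why the table gets much longer
// and much narrower.
let
    Source = PreviousStep,
    Unpivoted = Table.UnpivotOtherColumns(Source, {"Date", "Region"}, "Attribute", "Value")
in
    Unpivoted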

Kind regards!

No. I need the M code.

In the Advanced Editor (Home ribbon), press CTRL+A to select it all, then paste it here in the code box using the </> icon. It's more or less the reverse of the steps that let you use M code posted here.

 




I saw your M code. 

 

I don't see anything earth-shatteringly wrong. What is happening, though, is you have roughly 70 steps going through 100,000 rows, as you indicated. You have about a dozen REPLACE steps, and those steps go through all 100,000 rows every single time. I think that is probably the biggest single issue. You are also doing a lot of split columns, which is also time-consuming on a flat file of that size.
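One way to cut that down is to collapse a chain of replace steps into a single pass over the table. A minimal sketch (the Status column, the replacement pairs, and the PreviousStep reference are placeholders):

// One scan of the table instead of a dozen: fold every old/new pair
// over the column value with List.Accumulate.
let
    Source = PreviousStep,
    Replacements = {
        {"oldA", "newA"},
        {"oldB", "newB"},
        {"oldC", "newC"}
    },
    ReplacedAll = Table.TransformColumns(
        Source,
        {{"Status",
          each List.Accumulate(Replacements, _, (state, pair) => Text.Replace(state, pair{0}, pair{1})),
          type text}}
    )
in
    ReplacedAll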

 

The replace steps could mostly be replaced by merges to a DIM table, expanding what you need, but at this point I'd have to have the file to really dig in. I know for a fact this can be optimized, but I don't want to over-promise; working with flat files in the 100,000+ record range can be slow if you are doing a lot of transformations.
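For illustration, the merge version of a replace step looks roughly like this (the table and column names, DimStatus included, are hypothetical):

// Instead of hard-coded replacements, join the fact table to a small
// lookup table that maps raw values to clean ones, then keep only the
// cleaned column.
let
    Source = FactTable,
    Merged = Table.NestedJoin(Source, {"Status"}, DimStatus, {"RawValue"}, "StatusLookup", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "StatusLookup", {"CleanValue"}, {"StatusClean"}),
    Cleaned = Table.RemoveColumns(Expanded, {"Status"})
in
    Cleaned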




I mean, 100,000 rows is not a very large dataset; Python's pandas, for example, would handle 70 replace operations on strings of arbitrary size in under a minute.

I don't see why Power BI wouldn't use threading under the hood for these column-wise operations. However, a memory shortage or a processor bottleneck could lead to long load times.

In general, when working with large datasets the practice is: trim what you don't need by dropping columns or filtering rows, group as many like transformations together as possible (a sketch of grouping follows below), and only then do anything that makes your data fatter in memory, like splitting columns. That said, these are guidelines that will sometimes be broken out of necessity.
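As a minimal sketch of the grouping idea (the column names and transforms are placeholders): several like transformations handled in one Table.TransformColumns step instead of one step per column.

// Trim and clean three text columns in a single pass.
let
    Source = PreviousStep,
    Grouped = Table.TransformColumns(Source, {
        {"Name",  Text.Trim,  type text},
        {"City",  Text.Trim,  type text},
        {"Notes", Text.Clean, type text}
    })
in
    Grouped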


One suggestion is to convert from a flat file format to Parquet, then import that.
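The conversion itself happens outside Power BI, but the import side is a one-liner in M, assuming your Power BI version supports Parquet.Document (the file path here is hypothetical):

// Read a Parquet file directly in Power Query.
let
    Source = Parquet.Document(File.Contents("C:\data\sales.parquet"))
in
    Source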

ckeo
Frequent Visitor

Look at passing some transformations to Python. Power BI is inherently slow, and it looks like you're applying a ton of steps.

Anonymous
Not applicable

Instead of doing this many steps inside Power BI, try to create a view in the source and use that view as the source, with all the necessary actions such as renames, splits, etc. already applied there.

@Anonymous what do you mean exactly by creating a "view" in the source? I am using Excel and CSV files for this, so I can only transform the data in Power BI itself, or do you mean something else? Thanks in advance!

"Those steps go through all 100,000 rows every single time." @edhans 

 

Hmm... I did not know that. Thanks for the tip! So what would you advise to a beginner like me when working with such datasets? Because when I duplicate the fact table and remove other columns to create a dimension table, all the previous steps are still registered. Is there a better solution to my issue in Power BI, so that I can avoid doing all those steps (or at least not hurt Power BI's performance)?



I'm not clear on what you mean by the steps are still registered.

 

It really comes down to optimizing the code so steps are repeated as little as possible. I would also like to see how your FACT table is being split off into DIM tables. If you are doing 70 transformations and then creating 3 DIM tables, those 3 DIM tables force the full 70 transformations to run each time. There may be a way to split off the relevant data with references in Power Query first, do only what you need for each DIM table, and then finalize your FACT table with the necessary steps; see the sketch below.
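A minimal sketch of that staging pattern (the query and column names, BaseStaging and DimProduct included, are hypothetical):

// Query: BaseStaging does the shared cleanup once, conceptually.
// Each DIM query then references it and only selects and
// deduplicates the columns it needs.

// Query: DimProduct
let
    Source = BaseStaging,
    Selected = Table.SelectColumns(Source, {"ProductID", "ProductName"}),
    Deduped = Table.Distinct(Selected)
in
    Deduped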

 

But we are talking a lot of theory here. It is very hard to help without seeing the file. I fully understand your confidentiality concerns; I just want you to know why I am not providing a bullet-point list of things to do. I don't have enough info for that.




I understand your point. I would've loved to share it, but I'm not allowed to. What you've just mentioned already makes a lot of sense, though. I did not know that all those applied steps would weigh so heavily on Power BI.

To answer your first question: when I duplicate a query, all the applied steps are registered, so it does not start from blank. Would you recommend "referencing" a query instead of "duplicating" it? What is the exact difference between the two, and which is the better way to work toward a star model?

Thanks a lot!

A duplicate creates a separate copy. Changes made to the original no longer carry over to the copy.

A reference creates a link. Changes to the original also affect the referenced query.

Neither impacts the utility of a star schema; they are just how you get there. You want the most efficient way possible. I rely a lot on references and less so on duplicates, but that doesn't mean references are better. I think they are better for the places I use them in, but I will use the duplicate feature for some models.
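In M terms the difference is easy to see (the query name Sales is hypothetical here). A duplicate copies the entire script of the original query, while a reference is a new query whose only source is the original, so it stays in sync:

// Referenced query: one step, pointing at the other query by name.
let
    Source = Sales
in
    Source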




Thank you for your answers. It gave me a good overall view of how PBI works under the hood. I definitely need to learn more about it, but better late than never I suppose 🙂

I will take your tips into my current and future projects. If you have any other tips, feel free to share! Thanks again.

Glad I was able to help, at least with some understanding of what is happening, @Gjakova.

Best wishes on your project.

 



