Hi folks,
If you import a CSV into Power BI and then use Power Query to remove columns, does Power BI have to read all the columns as it processes the file, and only then remove them?
I'm asking because I've got users consuming on-prem files with 5M+ rows and 20+ columns, when they only need, say, 5 of the columns. When I publish the report to the Power BI service and schedule the refresh, I'm guessing my Power BI gateway has to read all the columns from the file, parse the delimiters, and then remove the unused columns as requested in the Power Query step. That puts additional load on the gateway (?). If the data were in a SQL Server database we could restrict the columns at the database level via the SELECT list, and/or Power BI would do it via query folding if we didn't specify the SQL statement for the import directly.
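For reference, this is roughly the query pattern in question, as a minimal Power Query M sketch (the file path and column names are placeholders, not from the original post):

```
let
    // The entire file is read and parsed here -- a CSV has no server
    // to fold the column selection back to
    Source = Csv.Document(File.Contents("C:\data\bigfile.csv"), [Delimiter = ",", Encoding = 65001]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Dropping unused columns as early as possible still helps:
    // later steps then work on a much narrower table
    Selected = Table.SelectColumns(Promoted, {"Col1", "Col2", "Col3", "Col4", "Col5"})
in
    Selected
```

Even though the full file is read, placing the column selection immediately after parsing keeps the rest of the query (and the data sent onward) as small as possible.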
Any input is appreciated
Hi @eskyline
To speed up Power Query performance, please refer to:
Table.Buffer for caching intermediate query results, or how to work around the Unnecessary Queries Issue
Performance Tip for Power BI; Enable Load Sucks Memory Up
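As a rough illustration of the Table.Buffer pattern those articles describe (the file path and step names here are placeholders, assumed for the example):

```
let
    Source = Csv.Document(File.Contents("C:\data\bigfile.csv"), [Delimiter = ","]),
    Promoted = Table.PromoteHeaders(Source),
    // Table.Buffer pins the table in memory at this point, so steps
    // that reference it don't re-evaluate (re-read the file) each time
    Buffered = Table.Buffer(Promoted)
in
    Buffered
```

Note that buffering a 5M-row table costs memory on the machine running the refresh (the gateway, in this case), so it only pays off when later steps would otherwise re-scan the source repeatedly.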
Best Regards
Maggie
Community Support Team _ Maggie Li
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
Hello @eskyline
I think that when the data source is a file, there is no way to speed this up: file sources don't support query folding, etc.
Do you do any other transformations on the data, like Table.AddColumn etc.?
If yes, then try putting Table.Buffer around your 2nd step.
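That suggestion might look like this in M (a sketch only; the file path, step names, and the added column are assumptions, not from the original query):

```
let
    Source = Csv.Document(File.Contents("C:\data\bigfile.csv"), [Delimiter = ","]),
    // The 2nd step, wrapped in Table.Buffer per the suggestion above,
    // so the parsed table is held in memory for the steps that follow
    Promoted = Table.Buffer(Table.PromoteHeaders(Source)),
    // A hypothetical added column that would otherwise trigger
    // repeated evaluation of the earlier steps
    Added = Table.AddColumn(Promoted, "Total", each [Qty] * [Price], type number)
in
    Added
```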
If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudos are nice too
Have fun
Jimmy