Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
(I love it when signing up to post a message dumps the message I already typed up!)
Hi everyone,
I have an issue with importing CSVs very slowly. My workflow is this:
1. Process txt files in Power Automate to split out the CSV table portion and save to another location as a csv file (both local and on SharePoint)
2. Use Power BI to import data from the CSV files into my dataset
The files themselves are all consistent in design (9 columns, between 7200 - 40,000 rows) and all get the same process applied to combine into one large table with only 7 columns that are calculated from the original columns - none of the original columns are added to the dataset.
Though my largest file is ~38k rows and ~3MB, the import process balloons up to 100+ GB of filename.csv. Most of the other files end up in the several GBs during import even though they are less than 1MB. Is this normal behavior?
The whole process also takes about 30 minutes to process the 20 files I currently have totalling less than 18MB. When this is all set up, I will have hundreds of files that need to be processed.
Note: I've also tried to do this in Power BI Online and it took me 1hr23min to complete an import of the same dataset from SharePoint Online.
So, a couple questions:
1. Is it normal for CSV import to appear to load something much larger than the actual CSV file size?
2. How can I speed up this import process? I've already tried disabling the autodetection of column types and date/time formatting
3. If I'm already optimized on the free tier (through a company Microsoft 365 tenant), what sort of speeds can I expect out of paying to make this faster?
Thanks in advance.
Solved! Go to Solution.
Hi @mwegener and @v-easonf-msft,
It turns out that it was a rookie mistake. I was processing my data during the import process which was causing the import to be ridiculously slow. I've since imported the CSVs as is and am applying the calculations to the imported data, which has reduced the import process from 90 minutes to < 10 seconds!
Lesson learned, now just to figure out how to get my formulas to work again (since I now have combined data in one table instead of individual tables).
Thanks for the feedback, though!
yes...Detail please!
Hi , @andygoes
As mentioned in this document, the maximum file size you can import into Power BI is 1 gigabyte.
When you try to publish to service, even with a Pro License, a single PBIX file is limited to 1GB (you can have a total of 10GB saved under one account, but an individual file is limited to 1GB). You might actually be able to publish it, but you will not be able to refresh the data online.
And in the import mode, large data will take a long time to load. So it is generally recommended to convert to directquery mode.
You may take a try to load csv to SQL Server and use DirectQuery in Power BI Desktop to connect to the SQL data source.
Best Regards,
Community Support Team _ Eason
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
I'm plagued with slow CSV files that are work fine intially but after a PowerBI crash start taking forever to load. I've got them on onedrive. Do you have any links for how to use directquery?
Hi @mwegener and @v-easonf-msft,
It turns out that it was a rookie mistake. I was processing my data during the import process which was causing the import to be ridiculously slow. I've since imported the CSVs as is and am applying the calculations to the imported data, which has reduced the import process from 90 minutes to < 10 seconds!
Lesson learned, now just to figure out how to get my formulas to work again (since I now have combined data in one table instead of individual tables).
Thanks for the feedback, though!
andygoes, are you conducting the calculations/transformations in a separate query which references your combined table query or are you doing the calculations in DAX once the source table has been imported? It would be helpful If you could share your new process.
- H
Hi... rookie here too. How can I know if I'm processing the data before or after the import? I'm having blocking issues on refreshing CSV files from a folder (around 20 files, 13-50mb each) and maybe I'm doing the same mistake but I can't see it.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
111 | |
95 | |
80 | |
68 | |
59 |
User | Count |
---|---|
150 | |
119 | |
104 | |
87 | |
67 |