Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
andygoes
Frequent Visitor

Slow CSV import

(I love it when signing up to post a message dumps the message I already typed up!)

 

Hi everyone,

 

I have an issue with importing CSVs very slowly. My workflow is this:

 

1. Process txt files in Power Automate to split out the CSV table portion and save to another location as a csv file (both local and on SharePoint)

2. Use Power BI to import data from the CSV files into my dataset

 

The files themselves are all consistent in design (9 columns, between 7200 - 40,000 rows) and all get the same process applied to combine into one large table with only 7 columns that are calculated from the original columns - none of the original columns are added to the dataset.

 

Though my largest file is ~38k rows and ~3MB, the import process balloons up to 100+ GB of filename.csv. Most of the other files end up in the several GBs during import even though they are less than 1MB. Is this normal behavior?

 

The whole process also takes about 30 minutes to process the 20 files I currently have totalling less than 18MB. When this is all set up, I will have hundreds of files that need to be processed.

 

Note: I've also tried to do this in Power BI Online and it took me 1hr23min to complete an import of the same dataset from SharePoint Online.

 

So, a couple questions:

1. Is it normal for CSV import to appear to load something much larger than the actual CSV file size?

2. How can I speed up this import process? I've already tried disabling the autodetection of column types and date/time formatting

3. If I'm already optimized on the free tier (through a company Microsoft 365 tenant), what sort of speeds can I expect out of paying to make this faster?

 

Thanks in advance.

1 ACCEPTED SOLUTION

Hi @mwegener and @v-easonf-msft,

 

It turns out that it was a rookie mistake. I was processing my data during the import process which was causing the import to be ridiculously slow. I've since imported the CSVs as is and am applying the calculations to the imported data, which has reduced the import process from 90 minutes to < 10 seconds!

 

Lesson learned, now just to figure out how to get my formulas to work again (since I now have combined data in one table instead of individual tables).

 

Thanks for the feedback, though!

View solution in original post

7 REPLIES 7
watertight
New Member

yes...Detail please! 

v-easonf-msft
Community Support
Community Support

Hi , @andygoes 

As mentioned in this document, the maximum file size you can import into Power BI is 1 gigabyte.

When you try to publish to service, even with a Pro License, a single PBIX file is limited to 1GB (you can have a total of 10GB saved under one account, but an individual file is limited to 1GB). You might actually be able to publish it, but you will not be able to refresh the data online. 

 

And in the import mode, large data will take a long time to load. So it is generally recommended to convert to directquery mode.

You may  take a try to load csv to SQL Server and use DirectQuery in Power BI Desktop to connect to  the SQL data source.

 

Best Regards,
Community Support Team _ Eason
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

 

 

I'm plagued with slow CSV files that are work fine intially but after a PowerBI crash start taking forever to load.   I've got them on onedrive.  Do you have any links for how to use directquery?

Hi @andygoes ,

 

you can share a screenshot of your transformation steps.

 

 

Did I answer your question?
Please mark my post as solution, this will also help others.
Please give Kudos for support.

Marcus Wegener works as Full Stack Power BI Engineer at BI or DIE.
His mission is clear: "Get the most out of data, with Power BI."
twitter - LinkedIn - YouTube - website - podcast


Hi @mwegener and @v-easonf-msft,

 

It turns out that it was a rookie mistake. I was processing my data during the import process which was causing the import to be ridiculously slow. I've since imported the CSVs as is and am applying the calculations to the imported data, which has reduced the import process from 90 minutes to < 10 seconds!

 

Lesson learned, now just to figure out how to get my formulas to work again (since I now have combined data in one table instead of individual tables).

 

Thanks for the feedback, though!

andygoes, are you conducting the calculations/transformations in a separate query which references your combined table query or are you doing the calculations in DAX once the source table has been imported? It would be helpful If you could share your new process.

 

- H

Anonymous
Not applicable

Hi... rookie here too. How can I know if I'm processing the data before or after the import? I'm having blocking issues on refreshing CSV files from a folder (around 20 files, 13-50mb each) and maybe I'm doing the same mistake but I can't see it.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.