Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Why does refreshing 1,1mln rows from csv on Sharepoint folder takes about 1hour,
whilst refreshing the same data saved locally takes 1min?
Hi Gordon2,
If your data source is online such as in onedrive or sharepoint online, they use automatic refresh. This means no user configuration is necessary in order for the dataset to be refreshed on a regular basis. Data refresh settings are configured for you by Power BI. For online service providers, refresh usually occurs once-a-day. For files loaded from OneDrive, automatic refresh occurs about every hour for data that does not come from an external data source. While you can configure different schedule refresh settings and manually refresh, you probably don’t need to.
Regards,
Jimmy Tao
Thanks Jimmy
Unfortunatelly it does not refreshed automaticly. I even do not want this to be refreshed automaticaly.
Every month my client puts csv file on his ondrive/sharepoint which I have to recconect to power bi dashboard.
This file is now about 1.1mln rows and weights above 1GB. It takes more then hour to refresh (if I do not get time out error).
My assumption is the connector to sharepoint folder is very very inefficient. Refreshing the same file from my local computer is not a problem. It running quite fast. So it is not a dataset size issue. Do not tell me please to amend architecture to store data in a different place like database or azure storage. I know this, but at this moment we use Onedrive/Sharepoint as a data storage.
It is quite common practice but I do not understad why it is not so efficient? Both are Microsofts ones, ya?
I find quite a few discussions here on forum, people brought similar questions. Many of them with no sollution given.
@Anonymous,
Try clicking Options-> Data Load-> Unclick all the options in Background Data, Time intelligence and Relationships, do click the Parallel loading of tables like below:
Regards,
Jimmy Tao
Thanks Jimmy
Let me add some thoughts. I repeated my tests after one month and got totally different results:
tests made in mid of Dec (PBI Nov 2018 release) -> it takes +60min to import to pbi desktop
tests made in mid of Jan (PBI Dec 2018 release) - > it takes about 5min to import to pbi desktop
I suppose big performance improvments in Dec 18 PBI desktop release.
I have also tested the settings you suggested and here are results.
With my original settings I got exact time of 5min 24 sec to import data:
With a new settings I got 5min 09sek to import data:
It is not a big diff but it is something. 🙂
I started to compress csv files to csv.gz
It gives significant reduction in size and loading times.
With some tricks Power BI can connect to such file.
I am using the November 2018 version of desktop and experiencing refresh timeouts trying to load a csv file from SharePoint Online. The file has 17,000+ records and is 45Mb in size. I noted that this is bigger than the 30Mb that will open in Excel on-line. I opened the file and saved as an xlsx file in the same SharePoint folder. It now only takes a few seconds to refresh. The file size now though is only 2.25Mb in size. So it seems I can either wait for the organisation to update the desktop version, or change the csv files to xlsx.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
111 | |
100 | |
80 | |
64 | |
58 |
User | Count |
---|---|
148 | |
111 | |
93 | |
84 | |
66 |