Do you think I have unreasonable expectations for using SharePoint folders as a datasource with this large number of files and this folder structure? I am curious whether others have had success in a similar situation.
I am doing query development in Excel Power Query and Power BI Desktop, then copy-pasting into a Power BI Online Service Dataflow.
I am experimenting with a SharePoint folder containing 4,000 CSV files as a datasource. The files are all small, 5 to 20 KB each, and sit in a nested structure no more than 3 levels deep, e.g. main folder \ subfolder 1 \ subfolder 2. The 4,000 files total about 390,000 rows, with 3 to 10 columns per file.
I have the exact same content on my local drive to use as a local datasource for development, and it runs very fast:
1) A query that returns a basic list of files finishes in seconds.
2) Another query that uses a function and parameter to extract rows from each binary returns all rows from the 4,000 files (about 390,000 rows) in less than 10 seconds.
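For reference, the local-drive version of these two queries can be sketched roughly as below. This is a minimal M sketch, not my exact queries; the path `C:\Data` and the use of `Csv.Document` are assumptions for illustration.

```m
// Sketch of the "list of files" + "combine all binaries" pattern
// against a local folder. C:\Data is a placeholder path.
let
    // Folder.Files already recurses through sub-folders
    Source = Folder.Files("C:\Data"),
    // keep only the csv files anywhere in the nested structure
    CsvFiles = Table.SelectRows(Source, each [Extension] = ".csv"),
    // parse each file's binary content and combine all rows
    Parsed = Table.AddColumn(CsvFiles, "Data", each Csv.Document([Content])),
    Combined = Table.Combine(Parsed[Data])
in
    Combined
```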
However, when using the SharePoint folder as a datasource:
1) The query that returns a basic list of files takes up to 10 minutes to finish, but most often times out.
2) The query that uses a function and parameter to extract rows from each binary has never finished; it always times out with "Error: DataSource.Error: SharePoint: Request failed: ".
This is the case whether I use Excel Power Query, Power BI Desktop, or a Power BI Online Service Dataflow.
Ideas? Suggestions?
Found useful blog post that addresses this issue
https://exceleratorbi.com.au/power-bi-and-sharepoint-terrible-together/
Specifically, more sub-folders increase the refresh time required, compared to having all files in one top-level folder.
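One mitigation consistent with that finding (an assumption on my part, not quoted from the blog) is to navigate straight to the one folder you need with `SharePoint.Contents`, rather than letting `SharePoint.Files` enumerate every file on the site. A minimal sketch, where the site URL and folder names are placeholders:

```m
// Navigate directly to a single library/folder instead of listing
// every file on the site. URL and names below are assumptions.
let
    Source = SharePoint.Contents(
        "https://contoso.sharepoint.com/sites/MySite",
        [ApiVersion = 15]
    ),
    // drill into the document library, then the target folder
    Docs = Source{[Name = "Shared Documents"]}[Content],
    MainFolder = Docs{[Name = "Main Folder"]}[Content]
in
    MainFolder
```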
Hi @009co ,
Do the replies above solve your problem? If it has been solved, please mark the correct reply as the standard answer to help the other members find it more quickly. Thanks in advance for your kind cooperation!
Hope it helps,
Community Support Team _ Caitlyn
If this post helps then please consider Accept it as the solution to help the other members find it more quickly.
hi Caitlyn, not answered yet, would love to hear more ideas and suggestions : )
Hi @009co
What I would suggest is using Table.Buffer, which could potentially save you a lot of time, as it buffers the results so the query does not have to go back and get the data multiple times.
Another option, which I use very often, is to install the OneDrive client on the Gateway Server and sync the folder to OneDrive. This lets you use the files as if they were local files.
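To make the Table.Buffer idea concrete, here is a minimal sketch: buffer the SharePoint file list once, so the steps that follow work against the in-memory copy instead of re-querying SharePoint. The site URL is a placeholder.

```m
// Buffer the file list returned by SharePoint so downstream steps
// don't hit the SharePoint API repeatedly. URL is an assumption.
let
    Source = SharePoint.Files(
        "https://contoso.sharepoint.com/sites/MySite",
        [ApiVersion = 15]
    ),
    // hold the whole file list in memory from this point on
    Buffered = Table.Buffer(Source),
    CsvFiles = Table.SelectRows(Buffered, each [Extension] = ".csv")
in
    CsvFiles
```

Note that Table.Buffer helps when the same table would otherwise be re-evaluated several times; it does not speed up the initial call to SharePoint itself.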
Will look into buffering. Haven't used it in a while but can see its potential.
Interesting idea using OneDrive sync to Gateway Server.
Thanks!