Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more.
Get startedGrow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.
Hi there!
So for some stupid reasons that can't really be changed for now we have reporting data exclusively in thousands of CSV files that date back all the way to 2021. In total we are talking about ~53k files for the last 4 years. I'm not sure about how many rows of data we are talking, but I'm expecting it to be ~1-2 million, not more. So nothing Power BI can't handle easily.
However, automated updates have failed to work a long time ago, due to the sheer amount of files being pulled each time, which obviously (at least to me?) calls for some partitioning and incremental updates.
I implemented a very basic partitioning setup, archiving anything that is older than 4 weeks from today. Whether that makes alot of sense or not - the initial partitioning fails again for the same reasons: time out bc too much to handle. At least that's what I'm getting.
My question: What are good resources for dealing with that problem? MS documentation is giving me a headache, YouTube is being way to general about it (at least from what I found after hours of digging) and the cases in the community forum here are either too specific or starting at a more expert level than I am currently (or both).
If anyone could share some good how-tos or courses about partitioning and maybe even about my problem described above I'd really appreciate it. I would haved jumped to Fivrr and just have someone else deal with it, but where's the learning in that? I mean, I will probably end up hiring someone anyways to get it up and running, but I still wanna mess with it in our Sandbox to become a better PowerBI person.
Thanks in advance!
Solved! Go to Solution.
For incremental refresh, your source needs to handle query folding (CSVs do not). However, you can create a datamart to take care of this: Incremental Refresh for Non-Query Folding Sources in Power BI Desktop (mssqltips.com)
Proud to be a Super User! | |
Hi, @doaks
Thanks for the reply from @audreygerred , please allow me to provide addition:
Before you can partition or incrementally update your CSV file, you need to effectively merge your CSV file into Power BI. You can use Power Query to merge files to optimize this process, especially if the files have a similar structure. For guidance on this initial step, see the documentation on using CSV files in Power BI:
Get data from comma separated value (CSV) files - Power BI | Microsoft Learn
Once the data is in Power BI, the next step is to set up incremental refresh to query and update only the data that has changed or been added since the last refresh. This is critical for working with large datasets and avoiding timeouts. Power BI's incremental refresh feature can help you do this. Here's a detailed guide on how to configure an incremental refresh policy:
How to Get Your Question Answered Quickly
Best Regards
Yongkang Hua
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi, @doaks
Thanks for the reply from @audreygerred , please allow me to provide addition:
Before you can partition or incrementally update your CSV file, you need to effectively merge your CSV file into Power BI. You can use Power Query to merge files to optimize this process, especially if the files have a similar structure. For guidance on this initial step, see the documentation on using CSV files in Power BI:
Get data from comma separated value (CSV) files - Power BI | Microsoft Learn
Once the data is in Power BI, the next step is to set up incremental refresh to query and update only the data that has changed or been added since the last refresh. This is critical for working with large datasets and avoiding timeouts. Power BI's incremental refresh feature can help you do this. Here's a detailed guide on how to configure an incremental refresh policy:
How to Get Your Question Answered Quickly
Best Regards
Yongkang Hua
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
For incremental refresh, your source needs to handle query folding (CSVs do not). However, you can create a datamart to take care of this: Incremental Refresh for Non-Query Folding Sources in Power BI Desktop (mssqltips.com)
Proud to be a Super User! | |
User | Count |
---|---|
83 | |
69 | |
68 | |
65 | |
53 |
User | Count |
---|---|
94 | |
92 | |
91 | |
77 | |
70 |