radoslavov91
New Member

Need help to speed up folder source

I have a folder that receives reports from a machine in our factory (working time, working rounds, etc.).

Every day at 00:01 the machine creates a new report file in a new folder, and each folder is named with the date of the report (this all happens automatically).

 

radoslavov91_0-1715671657473.png

Each folder contains exactly one file.

radoslavov91_1-1715671690459.png

Each file is between 400 KB and 4 MB, which is between 400 and 4,000 rows.

In Power BI I'm trying to load all files from the Laser folder, which contains one subfolder for each day of the year.

radoslavov91_2-1715672010825.png

Then I keep only the .csv files, because there are also some system-generated .ini files that I want to make sure are not loaded and do not cause errors.

radoslavov91_3-1715672125856.png

Then I keep only the folders for 2024 (and therefore only the files for 2024).

radoslavov91_4-1715672168948.png

These are my full query settings, where I rename some columns and remove unnecessary columns.

radoslavov91_5-1715672205414.png

radoslavov91_6-1715672417693.png

Then I expand the files to get the machine results (working time).

radoslavov91_7-1715672528376.png

Then I make some further corrections, such as renaming some columns to make more sense and removing unnecessary columns.

radoslavov91_8-1715672633564.png

radoslavov91_9-1715672650218.png

That is mostly it; I don't create any calculated columns or measures.

 

I don't need to tell you how long it takes for the file to load, so my question is: do you have any suggestions for optimizing this file so the data loads faster, such as combining queries, or any other tips and tricks to make this more efficient overall?

 

It is important to mention that I cannot change the way the files are generated by the machine; they are exported as CSV files every morning. If there is a way to combine these files in a single post-processing step, I'm up for trying it.

1 ACCEPTED SOLUTION
johnbasha33
Solution Sage

@radoslavov91 

Use Folder Path Parameter: Set up a parameter to dynamically select the folder path for the year 2024, reducing unnecessary data loading.

Combine Queries: Merge multiple query steps to minimize operations during data loading, like filtering .csv files and folders for 2024 together.

Reduce Data Cleaning: Only apply essential data cleaning steps to minimize processing time. Focus on transformations necessary for your analysis.

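Query Folding: Folder and CSV sources do not support query folding, so Power Query performs all of the filtering and transformation itself. Folding only helps if the reports are first loaded into a source that can process queries, such as a SQL database, which can then do the filtering for faster data retrieval.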

Incremental Loading: Implement incremental loading to only load new or updated data since the last refresh, reducing the amount of data loaded each time.

Data Compression: Optimize data compression settings to balance file size and performance, experimenting with different options.

Partitioning: If applicable, partition your data based on criteria like date to improve query performance by accessing relevant partitions only.

Data Model Simplification: Review your data model to remove unnecessary relationships or columns, simplifying it for better query performance and reduced memory usage.
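
Since you mentioned being open to combining the files as a post action, the incremental loading idea could be sketched outside Power BI roughly like this. This is only a minimal Python sketch under several assumptions that are not confirmed in your post: every dated folder name contains the year (e.g. "2024"), all report files are UTF-8 and share the same header row, and the function name `combine_daily_reports`, the extra `SourceFolder` column and the `.done` bookkeeping file are hypothetical names chosen for illustration.

```python
import csv
from pathlib import Path


def combine_daily_reports(root: Path, combined: Path, year: str = "2024") -> int:
    """Append rows from daily CSV files not seen before to one combined CSV.

    Returns the number of files appended during this run.
    """
    # Hypothetical bookkeeping file listing the files that were already merged.
    log_path = combined.parent / (combined.name + ".done")
    done = set()
    if log_path.exists():
        done = set(log_path.read_text(encoding="utf-8").splitlines())

    write_header = not combined.exists()
    appended = 0

    with combined.open("a", newline="", encoding="utf-8") as out, \
            log_path.open("a", encoding="utf-8") as log_file:
        writer = csv.writer(out)
        # Assumption: the folder name contains the year somewhere.
        folders = sorted(p for p in root.iterdir() if p.is_dir() and year in p.name)
        for folder in folders:
            # Only *.csv is picked up, so the .ini files are never opened.
            # Note: this pattern is case-sensitive on some systems.
            for file in sorted(folder.glob("*.csv")):
                key = str(file.relative_to(root))
                if key in done:
                    continue  # merged in an earlier run
                with file.open(newline="", encoding="utf-8") as src:
                    rows = csv.reader(src)
                    header = next(rows, None)
                    if header is None:
                        continue  # empty file, try again on the next run
                    if write_header:
                        writer.writerow(header + ["SourceFolder"])
                        write_header = False
                    for row in rows:
                        # Keep the folder name so the report date is not lost.
                        writer.writerow(row + [folder.name])
                log_file.write(key + "\n")
                appended += 1
    return appended
```

If something like this were scheduled to run shortly after the 00:01 export, Power BI would only need to read the single combined file with the Text/CSV connector instead of opening one file per day through the Folder connector. Treat it as a starting point and check the encoding and delimiter of your real files first.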

Did I answer your question? Mark my post as a solution! Appreciate your Kudos !!



