vicente89
New Member

What is the best way to work with a large amount of data with Gen1 Dataflows?

In my company we work with Dataflows Gen1 on a large volume of data, which in some cases gives us problems with data loading: there is a lot of data, and we load it through Denodo, which has a maximum loading time (timeout) of 15 minutes that runs out in some cases.

To work around this we have partitioned the data by semester so that each load handles a smaller volume, but even so we sometimes get the timeout error because there is not enough time to load all the data (a simplified example of one partition filter is below).
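For reference, each semester partition is just a date-range filter pushed into the source query, roughly like this (the DSN, view, and column names are illustrative, not our real ones):

let
    // Hypothetical Denodo ODBC source; filtering in the SQL itself keeps
    // the work Denodo must finish inside its 15-minute window small.
    Semester = Odbc.Query(
        "dsn=DenodoVDP",
        "SELECT order_id, order_date, amount FROM sales " &
        "WHERE order_date >= DATE '2024-01-01' AND order_date < DATE '2024-07-01'"
    )
in
    Semester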

How should we structure these dataflows so that we do not have these problems?
What is the best way to work with a large amount of data with Dataflows Gen1?

Clarification: we have to work with Gen1, since we are not allowed to use Gen2 yet.

1 ACCEPTED SOLUTION
v-huijiey-msft
Community Support

Hi @vicente89 ,

 

Thanks for the reply from sergej_og.

Instead of loading the entire dataset at once, consider an incremental refresh strategy.

This loads only the data that is new or has changed since the last refresh, greatly reducing the amount of data processed in each operation.

For instructions on setting up incremental refresh, see the Incremental refresh for semantic models and real-time data in Power BI - Power BI | Microsoft Learn documentation.
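As a rough sketch of what incremental refresh does under the hood: the service fills in a date/time window and the query keeps only the rows inside it. A minimal Power Query (M) illustration, assuming the standard RangeStart/RangeEnd datetime parameters and an illustrative dbo.Sales table with a ModifiedDate column (in a Gen1 dataflow, the equivalent window filter is generated for you when you configure incremental refresh on a datetime column):

let
    // Illustrative SQL source; server, database, and table are placeholders.
    Source = Sql.Database("myserver", "mydb"),
    Sales = Source{[Schema = "dbo", Item = "Sales"]}[Data],
    // RangeStart/RangeEnd are the datetime parameters the service supplies
    // at refresh time, so each refresh only processes one window of data.
    InWindow = Table.SelectRows(
        Sales,
        each [ModifiedDate] >= RangeStart and [ModifiedDate] < RangeEnd
    )
in
    InWindow

Because each refresh then only has to pull one window of rows from the source, individual queries are far more likely to finish within Denodo's 15-minute timeout.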

 

For maximum efficiency, review and optimize the source files.

 

For example:

1. Implement vertical filtering, i.e. remove the columns you do not need (see the sketch after this list).

2. Prefer custom columns created in Power Query over calculated columns created in DAX.

3. Disable "Auto date/time" in the data load options settings.

4. Use a subset of data for development, or apply horizontal filtering, i.e. remove the rows you do not need.

5. Use variables in DAX measure calculations.

6. Disable the Power Query query load for tables that are not required.
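A minimal Power Query (M) sketch of items 1 and 4 (vertical and horizontal filtering), assuming an illustrative CSV source and column names:

let
    Source = Csv.Document(File.Contents("C:\data\sales.csv"), [Delimiter = ",", Encoding = 65001]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Vertical filtering: keep only the columns the model actually needs.
    Columns = Table.SelectColumns(Promoted, {"OrderID", "OrderDate", "Amount"}),
    Typed = Table.TransformColumnTypes(Columns, {{"OrderDate", type date}, {"Amount", type number}}),
    // Horizontal filtering: keep only the rows in scope, e.g. one semester.
    Rows = Table.SelectRows(Typed, each [OrderDate] >= #date(2024, 1, 1) and [OrderDate] < #date(2024, 7, 1))
in
    Rows

Applying both filters as early as possible in the query means every later step, and the load itself, works on less data.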

 

For more information on optimizing queries, please see:

Power BI Performance Optimization Tips (mssqltips.com)

Optimization guide for Power BI - Power BI | Microsoft Learn

 

If you have any other questions please feel free to contact me.

 

Best Regards,
Yang
Community Support Team

 

If any post helps, then please consider accepting it as the solution to help the other members find it more quickly.
If I have misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!


2 REPLIES

sergej_og
Super User

Hey @vicente89,
what do you mean by "large amount of data"? How many rows/columns?
Your data load limitation is on the Denodo side, right?

I didn't quite catch your issue. Which one is the bottleneck in your eyes - the Dataflow or Denodo?

Regards
