Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
mcaop-21
New Member

Dataflow vs Dataset Query Timing

I am in the process of moving queries from SQL and SAP HANA from a dataset to a dataflow to be consumed in other data sets. The M script are identical copies. They both refresh daily. The dataset refreshes in ~45 minutes. The dataflow however, takes ~2 hrs 30 minutes. When looking at the refresh details download from the service, there are 5 tables, 4 of which refresh in a few seconds and one that takes from 30-40 minutes. However the data refresh reports taking 2 hrs 23 minutes??!? Either the duration is reported incorrectly or the end time. 

mcaop21_0-1636404265905.png

Looking at teh value in Excel  in the duration cell, it shows 1:23:32 AM, so I am guessing this is really a 1 hr 23 min duration. NOw the question is why is the refresh on this query so much longer for the dataflow than a dataset?

Not to mention, once this dataflow is refreshed, the datasets that use the tables will need to be refreshed as well which just adds addtional time. I did not expect to see this large of a difference.

 

Any thoughts are greatly appreicated.

 

 

 

1 ACCEPTED SOLUTION
v-xiaoyan-msft
Community Support
Community Support

Hi @mcaop-21 ,

 

If your dataset refresh takes a long time because you have applied a set of heavy data transformations in Power Query, then what you can do instead is to push that set of heavy transformations to a dataflow. That way, the transformations happen on a different process, it loads the output into Azure Data Lake storage of Power BI service, and then you can use that output as the input of the Power BI dataset.

 

Instead of doing the heavy lifting work in Power BI, just push them all to dataflows, and your data refresh time in Power BI dataset would be super fast! That said, you still need to schedule the refresh of the dataflow in the service. However, because that can run on a different schedule than the Power BI dataset itself, then you don’t have to wait for the refresh to finish to get you development work done.

 

For more information,please refer to:How to Use Dataflow to Make the Refresh of Power BI Solution FASTER! 

 

Hope it helps,


Community Support Team _ Caitlyn

If this post helps then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

2 REPLIES 2
v-xiaoyan-msft
Community Support
Community Support

Hi @mcaop-21 ,

 

If your dataset refresh takes a long time because you have applied a set of heavy data transformations in Power Query, then what you can do instead is to push that set of heavy transformations to a dataflow. That way, the transformations happen on a different process, it loads the output into Azure Data Lake storage of Power BI service, and then you can use that output as the input of the Power BI dataset.

 

Instead of doing the heavy lifting work in Power BI, just push them all to dataflows, and your data refresh time in Power BI dataset would be super fast! That said, you still need to schedule the refresh of the dataflow in the service. However, because that can run on a different schedule than the Power BI dataset itself, then you don’t have to wait for the refresh to finish to get you development work done.

 

For more information,please refer to:How to Use Dataflow to Make the Refresh of Power BI Solution FASTER! 

 

Hope it helps,


Community Support Team _ Caitlyn

If this post helps then please consider Accept it as the solution to help the other members find it more quickly.

GilbertQ
Super User
Super User

Hi @mcaop-21 

 

The only thing I can think of is when using a dataflow it has to also store those files in Azure Blob storage as well as some other things in the background. While in the dataset it gets put straight into the tabular model.

 

I would suggest looking at using the enhanced compute engine if you are looking to bring a dataflow into a dataset for it to run faster?





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors