cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Databeest
Advocate I
Advocate I

Incremental Refresh on a dataset with Dataflows as a data source

Hi,

 

I have a question about implementing Incremental Refresh on a dataset that connects solely to Dataflows. I tried setting it up but it appears Dataflows do not support query folding.

 

Few questions:

- Is it possible in any way to get Incremental Refresh working on a dataset that runs on Dataflows at this moment?

- If not, is there anything on the roadmap to enable this?

- If not, is there any way to refresh only part of a Dataflow using a workaround?

 

It feels really weird to have all the data in Azure Data Lake (Dataflows) but not being able to load it into a dataset due to memory issues.

31 REPLIES 31

Hi @yingyinr,

 

Any update on this?

Hey @yingyinr,

 

Would you happen to have an answer on the questions I asked? Or maybe a colleague could help you out?

 

Thanks so much in advance.

Hi @Databeest

Sorry for delay. There are two ways to configure incremental refresh: one is configure increment refresh in Power BI Desktop (it is available for Pro users) and another is use increment refresh with Power BI dataflows(currently only for Premium users). Currently you are connecting to dataflow as data source in Power BI Desktop, since the original data source is the dataflow, you can just configure increment refresh for the dataflow in Power BI Service. 

Best Regards

Rena

Community Support Team _ Rena
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Hi @yingyinr,

 

Thanks for your answer. It is absolutely clear to me that I can set up incremental refresh in dataflows. This is what I currently do. So the main benefit of this is that we only refresh the last 14 days of data each morning (approx. 2 million rows of data). This is working well and I'm happy with the feature.

 

However, our total dataset consists of approximately 70 million rows of data. And each morning, when I'm refreshing my dataset, it will refresh all of these 70 million rows of data. Even when I implement incremental refresh on the dataset, it will still refresh all rows. What I would expect it to do is refresh only the last 14 days of data, so that it only has to refresh 500 MB of data, instead of the entire ~7 GB.

 

How can I make sure that my dataset will only refresh the part that is refreshed in dataflows (the incremental part)?

Hi @Databeest ,

According to the official documentation, it is recommend to use data sources which support query folding when configure increment refresh. Otherwise, it may execute the complete refresh... You can check the following sentences:

Query folding

It's important the partition filters are pushed to the source system when queries are submitted for refresh operations. To push filtering down means the datasource should support query folding. Most data sources that support SQL queries support query folding. However, data sources like flat files, blobs, and web feeds typically do not. In cases where the filter is not supported by the datasource back-end, it cannot be pushed down. In such cases, the mashup engine compensates and applies the filter locally, which may require retrieving the full dataset from the data source. This can cause incremental refresh to be very slow, and the process can run out of resources either in the Power BI service or in the on-premises data gateway if used.



So please check if the original data sources using in dataflow support query folding or not. You can refer this video to check it.

In addition, hope the following documenations can help you.

Power BI Incremental Refresh and Query Folding

Power BI - Checking Query Folding with View Native Query

Best Regards

Rena

Community Support Team _ Rena
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Hi @yingyinr ,

 

Thanks again for your answer.

 

I do want to emphasize that I'm not trying to set up incremental refresh in Dataflows. I already got that working. I'm trying to set up incremental refresh in the dataset that connects to dataflows. So, because dataflows is the source, the question is if dataflows supports query folding. Does it?

Hi @Databeest ,

Yes, please check whether the underlying data source used in the dataflow support query folding or not. For the data sources that do not support query folding, you can set incremental refresh, but it may retrieve the entire data anyways. 

Best Regards

Rena

Community Support Team _ Rena
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

@yingyinr How is the underlying data source relevant? I'm connecting to Dataflows. The dataset doesn't care where I get the data from before that point, does it?

 

The question is, when I'm refreshing my dataset, do the queries it sends to dataflows get folded and is Incremental Refresh possible in this case? Or will it always load in the entire dataflow?

We have the very same question (dataset incremental refresh using dataflows as source, regardless of how the dataflow itself gets its data).

Has this been answered?

The simple answer is NO.

 

Many have tried workarounds, but nothing official from @microsoft.

ldacey
Frequent Visitor

@pat_mecee Have you tested this? I ran a test and it seemed to work, but I did not proceed (my responsibility is just to create the dataflow sources and I do not use PBI Desktop or datasets much).

 

It would be nice to have official communication here. 

 

https://docs.microsoft.com/en-us/power-query/dataflows/incremental-refresh#dataflow-incremental-refr...

 

"Dataflow incremental refresh and dataset incremental refresh are designed to work in tandem. It's acceptable and supported to have an incrementally refreshing entity in a dataflow, fully loaded into a dataset, or a fully loaded entity in a dataflow incrementally loaded to a dataset.

Both approaches work according to your specified definitions in the refresh settings."

Helpful resources

Announcements
PBI User Groups

Welcome to the User Group Public Preview

Check out new user group experience and if you are a leader please create your group

April Update

Check it Out!

Click here to read more about the April 2021 Updates!

secondImage

The largest Power BI virtual conference

100+ sessions, 100+ speakers, Product managers, MVPs, and experts. All about Power BI. Attend online or watch the recordings.

secondImage

Experience what’s next for Power BI

See the latest Power BI innovations, updates, and demos from the Microsoft Business Applications Launch Event.

Top Solution Authors
Top Kudoed Authors