john_ach
Frequent Visitor

Incremental Refresh - Duplicates or Missing Updates

Hi,

 

I'm having an issue getting incremental refresh to operate as I'd expect, and am hoping you might be able to help me out.

 

I am building a set of reports on data from our Microsoft Dynamics PSA CRM tool. The desire is to have these reports show data as close to live as possible, meaning a scheduled refresh every half hour. To achieve that I need to have the refresh duration comfortably under 30 minutes.

 

My architecture / data pipeline looks like this: Dynamics -> SQL DB Export -> Dataflow (Extract) -> Dataflow (Transform 1) -> Dataflow (Transform 2) -> Dataset -> Report

 

To achieve the short (and efficient) refresh I have incremental refresh set up on the Dataflow (Extract) entities. The tables coming from Dynamics all have two datetime fields, createdon and modifiedon, which record when each row was created and when it was last modified. I have tried several combinations of incremental refresh settings with those date fields and none has given the behaviour I would expect.

 

Desired incremental refresh behaviour:

  • All data kept for all time (e.g. not just last 5 years)
  • Rows reflect the original data source exactly
  • Only rows which have changed since the previous refresh are updated
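To make the desired behaviour concrete, here is a minimal Python sketch of what I mean. It is only an illustration, not how dataflows are implemented: the function name `upsert_changed_rows` and the `id` primary-key column are hypothetical, and it assumes every row carries a unique key plus the modifiedon field.

```python
from datetime import datetime


def upsert_changed_rows(stored, source_rows, last_refresh):
    """Return a copy of `stored` with every source row modified after
    `last_refresh` inserted or overwritten, keyed by the (hypothetical) `id`."""
    result = dict(stored)  # untouched rows are carried over as-is
    for row in source_rows:
        if row["modifiedon"] > last_refresh:
            # New rows are added, changed rows replace the stored copy,
            # so there is never more than one row per id.
            result[row["id"]] = dict(row)
    return result
```

Note that this sketch does not handle rows deleted from the source; it only covers inserts and updates.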


Tested settings and results:

 

[A] No incremental refresh:

  • Dataflow reflects database exactly - Good
  • Full refresh slow - Bad


[B] Filter on createdon, store for 100 years, refresh from past 1 days, detect data changes on modifiedon:

  • Dataflow has same number of rows as database - Good
  • Dataflow does not include changes made in database - Bad


[C] Filter on modifiedon, store for 100 years, refresh from past 1 days, detect data changes on modifiedon:

  • Dataflow has duplicate entries for rows modified since previous refresh - Bad
  • Subsequent remove duplicates step required in dataflow - Bad
  • All database changes included in dataflow - Good


[D] Filter on createdon, store for 100 years, refresh from past 1 days, detect data changes off:

  • Dataflow has same number of rows as database - Good
  • Dataflow does not include changes made in database - Bad


[E] Filter on modifiedon, store for 100 years, refresh from past 1 days, detect data changes off:

  • Dataflow has duplicate entries for rows modified since previous refresh - Bad
  • Subsequent remove duplicates step required in dataflow - Bad
  • All database changes included in dataflow - Good
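One possible explanation for these results can be sketched in Python. The model below is an assumption, not a confirmed description of the service: it supposes that incremental refresh splits the stored data into partitions by the filter column, re-queries only the partitions inside the refresh window, and never touches older partitions. The daily partition size, the function names, and the `id` and `status` columns are all hypothetical.

```python
from datetime import date, timedelta


def full_load(source, filter_column):
    """Build one partition per day of `filter_column` from all source rows."""
    partitions = {}
    for row in source:
        partitions.setdefault(row[filter_column], []).append(dict(row))
    return partitions


def incremental_refresh(stored, source, filter_column, today, refresh_days=1):
    """Rebuild only the partitions inside the refresh window; keep older ones."""
    window_start = today - timedelta(days=refresh_days)
    # Partitions older than the window are kept exactly as stored.
    result = {day: rows for day, rows in stored.items() if day < window_start}
    # Partitions inside the window are rebuilt from the current source rows.
    for row in source:
        if row[filter_column] >= window_start:
            result.setdefault(row[filter_column], []).append(dict(row))
    return result


def simulate(filter_column):
    """Load one record, edit it months later, refresh, and list stored statuses."""
    original = {"id": 1, "createdon": date(2021, 1, 1),
                "modifiedon": date(2021, 1, 1), "status": "open"}
    stored = full_load([original], filter_column)
    edited = dict(original, modifiedon=date(2021, 7, 10), status="closed")
    refreshed = incremental_refresh(stored, [edited], filter_column,
                                    today=date(2021, 7, 10))
    return [row["status"] for rows in refreshed.values() for row in rows]
```

Under this model, `simulate("createdon")` leaves only the stale "open" row, because the edited record still belongs to an old partition that is never re-queried (as in [B] and [D]). `simulate("modifiedon")` returns both the "open" and the "closed" row, because the edited record lands in a new partition while its old copy stays in the old one (as in [C] and [E]). If detect data changes only decides whether partitions inside the window are re-queried, that would also be consistent with [B] matching [D] and [C] matching [E].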


This behaviour doesn't seem to be correct, according to my understanding of the intention of incremental refresh. Please could you help me understand my mistake(s) or misunderstanding, and give guidance on which datetime fields to use.

 

Thanks for your help

3 REPLIES
john_ach
Frequent Visitor

Hi,

 

Any suggestions as to where I'm going wrong, or is this expected behaviour?

 

Thanks for any help you can give

john_ach
Frequent Visitor

Hi,

 

Yes, I would like only data that has changed to be refreshed, while reflecting the contents of the original database exactly (no duplicates, and reflecting all changes).

 

I have configured incremental refresh as per the linked article. It doesn't specify which datetime field to use, or how the behaviour would change by using different datetime fields.

 

The example given in that article uses refresh on the createdon datetime field. That is [D] from my tests (original post) and results in the dataflow not reflecting changes made to the source database.

I have also tried with detect data changes on, via the modifiedon datetime field, which is [B] from my tests. That gives the same (incorrect) results.

 

Is this expected behaviour? And/or what settings should I be using?


Thanks for your help

v-yingjl
Community Support

Hi @john_ach ,

Only data that's changed needs to be refreshed under incremental refresh.

You can follow this document to set up incremental refresh for dataflows:

Configuring incremental refresh for dataflows 

 

Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
