cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
petersmith Visitor
Visitor

Dataflow incremental refresh with primary key

Hi All,

 

Loving dataflows, but would love the ability to do an incremental refresh in more of an 'upsert' style. That is, maintain the uniqueness of a primary key within the dataset. Any ideas on how to do this? I assume it will look something like excuting some sort of M-style merge after the dataflow incremental refresh pulls in more data.


For refrence my table size is ~1 million rows.

9 REPLIES 9
Super User
Super User

Re: Dataflow incremental refresh with primary key

Hi there

As it currently stands you have to use a DateKey in order to do the incremental refreshing.

THis will allow you to possibly only refresh the latest day. And with only having 1 million rows in total, it should be really quick.

Did I answer your question? Mark my post as a solution!
"Proud to be a Datanaut!"
Power BI Blog
Highlighted
markive Regular Visitor
Regular Visitor

Re: Dataflow incremental refresh with primary key

I assume the meaning of the question is similar to my own..

 

I can use a Date field in my data to sync only the last day for example. But my 'Upsert' question is will it be able to replace existing data if the primary key shows it already exists?

 

Lets say I have 10 records, one record created each day. If a user of my system updates record 2 that was originally created 2 says ago but updated today, will the incremental refresh 'Upsert' it? Or will this break the refresh?

 

When I read about partitions and how incremental refresh seemed to work in original Analysis Services it seems to be purely additive for new data incrementally.

Super User
Super User

Re: Dataflow incremental refresh with primary key

Hi there

That would work as either by making the incremental refresh look at the DateTime column of the updated data.

Or you could look at the following below when setting it up

 

image.png


Did I answer your question? Mark my post as a solution!
"Proud to be a Datanaut!"
Power BI Blog
SamRock Regular Visitor
Regular Visitor

Re: Dataflow incremental refresh with primary key

I SO GLAD someone asked this question too!

Currently incremental refresh works for the data was updated based on datetime stamp, say last 10 days. It does not take into consideration the Primary Key that might have been inserted in the past.
Eg: lets say a "Sales Order" was created 3 months with a Status "Open", but only today its status was changed to "Shipped". Since I am only looking for past 10 days data for incremental refresh, a new Sales Order row gets inserted into the Dataset causing my "Sales Order" table to be duplicated.

How can we avoid this? Ideally along with the Datetime stamp, there needs to be a option where we specify the Unique Key column too. If incremental refresh contains any of the Unique Keys loaded in the past, they need to be deleted too and reinserted.

Super User
Super User

Re: Dataflow incremental refresh with primary key

Hi there

 

What you can do is to also enable the option for Detect Changes, where you can update the column based on a date, which will be included in the incremental process

 

image.png


Did I answer your question? Mark my post as a solution!
"Proud to be a Datanaut!"
Power BI Blog
SamRock Regular Visitor
Regular Visitor

Re: Dataflow incremental refresh with primary key

Hi @GilbertQ  Thanks. But this does not take care of my scenario. If only depends on Max date in "Changes detected". If my date column is not falling in the rangeof teh incremental refresh cycle , it will still miss the older rows. 

Let me know if I missed something

Super User
Super User

Re: Dataflow incremental refresh with primary key

Hi there

My understanding is that it will also look for changes to the changed column and if it is newer than the last date, it will process that data too?

Did I answer your question? Mark my post as a solution!
"Proud to be a Datanaut!"
Power BI Blog
SamRock Regular Visitor
Regular Visitor

Re: Dataflow incremental refresh with primary key

Yes it does.. but you will need another date column thats different from the Incremental refresh date.
Do you have any example or scenario that can help me understand how to use this option?

Super User
Super User

Re: Dataflow incremental refresh with primary key

Hi there

Here are more details on how it works

https://docs.microsoft.com/en-us/power-bi/service-dataflows-incremental-refresh

Did I answer your question? Mark my post as a solution!
"Proud to be a Datanaut!"
Power BI Blog

Helpful resources

Announcements
Can You Solve These Challenge

Challenge: Can You Solve These?

Find out how to participate in the first Power BI 'Can You Solve These?' challenge.

Community News & Announcements

Community News & Announcements

Get your latest community news and announcements.

Virtual Launch Event

Microsoft Business Applications October Virtual Launch Event

Join us for an in-depth look at the new innovations across Dynamics 365 and the Microsoft Power Platform.

Community Kudopalooza

Win Power BI Swag with Community Kudopalooza!

Each week, complete activities and be qualified in the drawing for cool Power BI Swag.

Users Online
Currently online: 118 members 1,452 guests
Please welcome our newest community members: