Re: Incremental Refresh Error: Duplicates in prima...

soptim_vond · ‎01-29-2021

Hi folks,

I´m having the following problem with incremental refresh:

My model includes a table in which I create a unique key by concatinating a date column with an id column (which results in multiple rows in the key column) and then adding a step "remove duplicates" which (as expected) results in a unique key column.

However, as soon as I use incremental refresh (keep last 10 days and refresh / add 1 day) I get an error message saying that there are duplicates in this column.

Is it possible that the "remove duplicates" step is not performed on the whole data at the end (containing historical and refreshed data) but that it is actually performed once for historical data and once for new data, which are then appended resulting in remaining duplicates?

How would you suggest to prevent that from happening?

Please find below the m-code of the table in question with the relevant part highlighted in red.

Thanks a lot in advance,

Nico

M Code:

let

DatenAb=Text.From(Date.From(RangeStart)),
DatenBis=Text.From(Date.From(RangeEnd)),
Mandant="AbsatzPrognose",
Segment=Segmentfilter,

Quelle = Oracle.Database(Datenbank,
[Query=
"SELECT
mandant,
commodity,
vertragsnummer,
segment,
vertragsnehmer,
vertragsnehmer_knr,
anschlussnehmer,
anschlussnehmer_knr,
zp_id,
malo_id,
malo_zaehlpunktbezeichnung,
malo_marktlokation_ident,
rgzs_name,
land,
ort,
plz,
strasse,
haus_nr,
haus_nr_e,
zaehlverfahren,
ht_arbeit,
nt_arbeit,
kundenwert,
prog_param_entn_kurzfrist,
prog_param_entn_mittelfrist,
prog_param_entn_langfrist_norm,
prog_param_entn_langfrist_kalt,
prog_param_entn_langfrist_warm,
prog_param_eins_kurzfrist,
prog_param_eins_mittelfrist,
prog_param_eins_langfrist_norm,
prog_param_eins_langfrist_kalt,
prog_param_eins_langfrist_warm,
tag

FROM VIM_OWNER.V_INSIGHT_PROGNOSE_CHECK pp
WHERE pp.MANDANT = '"& Mandant &"'
AND pp.SEGMENT = '"& Segment &"'

AND pp.TAG >= to_date('"&DatenAb&"', 'dd.mm.yyyy')
AND pp.TAG < to_date('"&DatenBis&"', 'dd.mm.yyyy')
"
, HierarchicalNavigation=true]),
#"Gefilterte Zeilen" = Table.SelectRows(Quelle, each [TAG] >= RangeStart and [TAG] < RangeEnd),
#"Monatsbeginn eingefügt" = Table.AddColumn(#"Gefilterte Zeilen", "Monatsbeginn", each Date.StartOfMonth([TAG]), type datetime),
#"Geänderter Typ" = Table.TransformColumnTypes(#"Monatsbeginn eingefügt",{{"Monatsbeginn", type date}}),
#"Zusammengeführte Spalte eingefügt" = Table.AddColumn(#"Geänderter Typ", "Key_Malo_Monat", each Text.Combine({Text.From([MALO_ID]), Text.From([Monatsbeginn], "de-DE")}, ""), type text),
#"Entfernte Duplikate" = Table.Distinct(#"Zusammengeführte Spalte eingefügt", {"Key_Malo_Monat"})
in
#"Entfernte Duplikate"

jeroen_guns · ‎10-12-2022

Having the same issue. Error appears when there are multiple steps preformed in power query. With full load the dataset loads just fine. Another MS product that doesn't work as advertised.
Can thus only be implemented for tables with no extra power query steps. (Adding a remove duplicate(distinct) doesn't work.

lbendlin · ‎10-12-2022

Try this - Convert your power query to a Dataflow. You will notice that the range filters will automatically move to the end of the M chain, preventing you from adding steps after.

Incremental refresh works as designed. If that's a good design or not is indeed debatable.

lbendlin · ‎01-29-2021

Your issue is probably unrelated to incremental refresh. Run a fixed query (with hard coded values instead of RangeStart and RangeEnd) and you will likely see the same issue. It can also be related to null values in your index column

soptim_vond · ‎01-31-2021

Thanks for your suggestion, however that is exactly what I have done in order to rule out other causes. Refreshing the data for the same timespan without incremental refresh is completed successfully. Any other ideas?

Incremental Refresh Error: Duplicates in primary key (even after remove duplicates)

Helpful resources

Fabric certifications survey

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024