Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
rabih
Frequent Visitor

Keeping latest record of duplicate values when duplicates are Odd and removing them if they are even

I have the same ticket number that might be duplicated a couple of times because of a Credit/Rebill situation. So if it's the wrong tax applied or wrong "Chassis number" entered, we credit the Original Sales Order and we rebill using the same ticket number.  But at times the Rebill is entered wrong, so it's credited and Rebilled again bringing the number of records for the same ticket to 5. In short, keep the last record if number of duplicates is odd and remove all the records if count of duplicates is even.

 

RecID SO#    CustNo   Ticket #    Amount Due    SO Origin    Chassis #

1        SO1         1          222              $222.22         Original     ACB123  

2        SO1         1          222            -$222.22          Credit       ACB123   =>Wrong Chassis # so Credit/Rebill

3        SO1         1          222             $222.22           Rebill       ABC123    (In this case I want to keep this record for Analysis)

 

4        SO4         1          333                 $60               Original    PPQ555

5        SO4         1          333               -$60               Credit        PPQ555 => Wrong customer selected so Credit (Remove both recs)

 

6        SO5         4          333                 60                 Original    PPQ555

 

I need to delete RecID 1 and 2 and keep 3 (Count of Duplicates for same ticket is 3) and I need to remove RecID 4 and 5 because their count is 2. Can this be achieved in Power BI?

 

 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Edit - Please name the 2nd table "RecordsToKeep".

 

The 3rd table (called "Data") references the 2nd table by name, so make sure it has the right name!

View solution in original post

3 REPLIES 3
diegoadum
Helper I
Helper I

The easiest solution is to add an index field and make a grouping using your key fields but using the MAX function in the grouping based on the index field. This will guarantee that you will always keep the last record of the group. Them, you can use the result table as a filter (via merge) with the original table to get all the fields again.

Anonymous
Not applicable

Yes, this is feasible using Query Editor.  However, I can't guarantee what the performance will be, that will be based on how large the dataset is.  You may need to work with your IT group to do some heavy lifting in SQL to generate that table for you, and then pull it in with Power BI directly.

 

That said, here's what I did.  I created 3 separate tables to do this.  The first is all of the data with no transformations happening, and I called it "Stage":

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"RecID", Int64.Type}, {"SO#", type text}, {"CustNo", Int64.Type}, {"Ticket #", Int64.Type}, {"Amount Due", type number}, {"SO Origin", type text}, {"Chassis #", type text}})
in
    #"Changed Type"

This will obviously be different for you, as you're connecting to the source differently.  I built the source data in Excel.  Adjust this table as needed.

 

Then, you'll create a table that only lists the records that you want to keep:

let
    Source = Stage,
    #"Grouped Rows" = Table.Group(Source, {"SO#"}, {{"Count", each Table.RowCount(_), type number}, {"Last RecID", each List.Max([RecID]), type number}}),
    #"Add DeleteFlag" = Table.AddColumn(#"Grouped Rows", "DeleteFlag", each if Number.IsEven([Count]) then "Delete" else "Keep", type text),
    #"Filtered Rows" = Table.SelectRows(#"Add DeleteFlag", each ([DeleteFlag] = "Keep")),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Count", "DeleteFlag"})
in
    #"Removed Columns"

It starts with the Stage table.  Then it groups that table by SO#, and creates 2 new columns:  How many rows that SO# has, and how what the last RecID number is.

DeleteFlag is a new column that is added.  It checks if the row count is even and says "Delete", otherwise it says "Keep".

We then filter the table to show only records with "Keep" in the DeleteFlag column.

Then we remove the Count and DeleteFlag columns, keeping only the SO# and RecID.

 

The last table uses that RecordsToKeep table to say which rows from the Stage table to keep:

 

let
    Source = Stage,
    #"Merged Queries" = Table.NestedJoin(Source,{"SO#", "RecID"},RecordsToKeep,{"SO#", "Last RecID"},"RecordsToKeep",JoinKind.Inner),
    #"Removed Columns" = Table.RemoveColumns(#"Merged Queries",{"RecordsToKeep"})
in
    #"Removed Columns"

We merge on both SO# and RecID.  The JoinKind.Inner makes sure we only keep rows that appear in both tables.

We then remove that nested join Column.  We don't need it, but it needs to be created to get us the join we needed.

 

That should do it for you!

 

PS - If the RecID is always a unique value in your table, then you change the last table to this:

let
    Source = Stage,
    #"Merged Queries" = Table.NestedJoin(Source,{"RecID"},RecordsToKeep,{"Last RecID"},"RecordsToKeep",JoinKind.Inner),
    #"Removed Columns" = Table.RemoveColumns(#"Merged Queries",{"RecordsToKeep"})
in
    #"Removed Columns"

Essentially, you're only asking to join on the RecID.  That should improve performance, and I'd recommend doing it that way if you can.

Anonymous
Not applicable

Edit - Please name the 2nd table "RecordsToKeep".

 

The 3rd table (called "Data") references the 2nd table by name, so make sure it has the right name!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.