Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
CR
Resolver II
Resolver II

remove duplicate - specific values

Hi!

 

I need help! I get a list of references. Each reference (column 1) has several revisions (column 2) so I get several lines for one reference:

  • line 1 => ref_1 | rev A
  • line 2 => ref_1 rev B
  • line 3 => ref_1 rev 0
  • line 4 => ref_1 rev 1
  • line 5 => ref_2 rev 0
  • line 6 => ref_2 rev 1
  • ...

Note: the order of the revision is A => B => C... => 0 => 1...

I'm trying to remove specific rows by using the remove duplicate option. The idea is to keep the last "revision" of each reference (in this specific case, the rev 1 so the line 4 and the line 6).

For that, I create a new column by merging references and revisions (column 3) and I've applied a sort (let's say Ascending):

  • line 1 => ref_1 | A | ref_1-A
  • line 2 => ref_1 B | ref_1-B
  • line 3 => ref_1 0 | ref_1-0
  • line 4 => ref_1 1 | ref_1-1
  • line 5 => ref_2 0 | ref_2-0
  • line 6 => ref_2 1 | ref_2-1

Once it's done, I made a right click on the column 1 and I select remove duplicate. Result: only the line 1 and the line 5 are displayed:

  • line 1 => ref_1 | A | ref_1-A
  • line 2 => ref_1 | B | ref_1-B
  • line 3 => ref_1 | 0 | ref_1-0
  • line 4 => ref_1 | 1 | ref_1-1
  • line 5 => ref_2 0 | ref_2-0
  • line 6 => ref_2 | 1 | ref_2-1

I tried also to apply a different sort (Descending) before applying the remove duplicate:

  • line 1 => ref_2 1 | ref_2-1
  • line 2 => ref_2 0 | ref_2-0
  • line 3 => ref_1 1 | ref_1-1
  • line 4 => ref_1 | 0 | ref_1-0
  • line 5 => ref_1 B | ref_1-B
  • line 6 => ref_1 | A | ref_1-A

But, the same result!

  • line 1 => ref_2 | 1 | ref_2-1
  • line 2 => ref_2 0 | ref_2-0
  • line 3 => ref_1 | 1 | ref_1-1
  • line 4 => ref_1 | 0 | ref_1-0
  • line 5 => ref_1 | B | ref_1-B
  • line 6 => ref_1 | A | ref_1-A

So, do you know how to solve it ? a trick ?

And more broadly, what is the criteria of the selection when we use the remove duplicate ?

 

Thanks for your attention.

 

Regards,

Camille

1 ACCEPTED SOLUTION

Dear Smoupre,

 

Thanks a lot, I used the buffering and it worked !

 

Before:

 

let
    ...
    #"Inserted Merged Column" = Table.AddColumn(#"Added Conditional Column", "DOC_REF", each Text.Combine({[Document Client Reference], Text.From([#"REV-NEW"], "fr-FR")}, "_"), type text),
    #"Sorted Rows" = Table.Sort(#"Inserted Merged Column",{{"DOC_REF", Order.Descending}}),
in
    ...

Now:

let
...
    #"Inserted Merged Column" = Table.AddColumn(#"Added Conditional Column", "DOC_REF", each Text.Combine({[Document Client Reference], Text.From([#"REV-NEW"], "fr-FR")}, "_"), type text),
    #"Sorted Rows" = Table.Sort(#"Inserted Merged Column",{{"DOC_REF", Order.Descending}}),
    #"Buffered" = Table.Buffer(#"Sorted Rows"),
    #"Removed Duplicates" = Table.Distinct(#"Buffered", {"Document Reference"})
in
    #"Removed Duplicates"

Thanks !

 

Regards,

CR

View solution in original post

3 REPLIES 3
DAX0110
Resolver V
Resolver V

Hi @CR, is it safe to assume that the revisions will appear in strict chronological order? 

If that's the case, and the "last revision" for each "ref_N" will always be the last row in its group,

then you can use this M script in Query Editor to retrieve them.

 

Before using this script, you need to process the table into the following form: (which you have probably already done)

 

line          ref            revision

line 1       ref_1        rev A

line 2       ref_1        rev B

line 3       ref_1        rev 0

line 4       ref_1        rev 1

line 5       ref_2        rev 0

line 6       ref_2        rev 1

 

The important points are:

(1) the 3 column names must be "line", "ref", and "revision"

(2) the table must be named "tbl" because my script depends on it as the input

 

Then, open the "Advanced Editor"  and add the following code (starting with "final = Table.Group")

to the end of the script:

 

let
    ...
    final = Table.Group(
        tbl
        , {"ref"}
        , {
            {"line", each Table.Last(_)[line], type text}
            , {"revision", each Table.Last(_)[revision], type text}
          }
        )
in
    final

 

This will output the last row of each "ref_N" group.

 

 

Greg_Deckler
Super User
Super User

Try buffering your table before your removal of duplicates.

 

https://social.technet.microsoft.com/Forums/en-US/00a09332-fe6e-47a4-a8a9-b71e38b01a86/how-to-remove...

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Dear Smoupre,

 

Thanks a lot, I used the buffering and it worked !

 

Before:

 

let
    ...
    #"Inserted Merged Column" = Table.AddColumn(#"Added Conditional Column", "DOC_REF", each Text.Combine({[Document Client Reference], Text.From([#"REV-NEW"], "fr-FR")}, "_"), type text),
    #"Sorted Rows" = Table.Sort(#"Inserted Merged Column",{{"DOC_REF", Order.Descending}}),
in
    ...

Now:

let
...
    #"Inserted Merged Column" = Table.AddColumn(#"Added Conditional Column", "DOC_REF", each Text.Combine({[Document Client Reference], Text.From([#"REV-NEW"], "fr-FR")}, "_"), type text),
    #"Sorted Rows" = Table.Sort(#"Inserted Merged Column",{{"DOC_REF", Order.Descending}}),
    #"Buffered" = Table.Buffer(#"Sorted Rows"),
    #"Removed Duplicates" = Table.Distinct(#"Buffered", {"Document Reference"})
in
    #"Removed Duplicates"

Thanks !

 

Regards,

CR

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.