Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

remove duplicates based on certain criteria

Hi Everyone, 

 

I wondered if in power query it's possible to remove duplicate values based on certain criteria. My data would look like this:

 

 

IDClientStageFeeCountryurlSource
2ABC.win500FRabc.compipeline
3XXX  FRxxx.comintake
1MMM     
2ABCwin500 FR intake
2abc 100 abc.comintake
3XXXlost100NL pipeline

 

I want to remove the duplicate values based on ID but source = "pipeline". Is it possible to achieve this?

 

Any small help would be appreciated.

 

Thanks in advance.

1 ACCEPTED SOLUTION
v-lid-msft
Community Support
Community Support

Hi @Anonymous ,

 

We can add a step using Table.SelectRows function, please change #"Changed Type" to the name of your previous step.

 

 

RemoveDuplicate = Table.SelectRows(#"Changed Type",each let s=[Source] ,i = [ID] ,  totalSameIDNumber = Table.RowCount(Table.SelectRows(#"Changed Type",each [ID]=i)) in totalSameIDNumber=1 or s="pipeline")

 

 

9.jpg

 

All the queries are here:

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlLSUXJ0ctYDUuWZeUDS1MAASLoFAYnEpGS95PxcIKsgsyA1JzMvVSlWJ1rJGCgQEREBJBWgGKy6oqICqjozryQxG6LWEMj19fVFUovAIHmo9ci2ww1UQDXKCOIiqIQh2JkKKK5EUoxwY05+cQlcvZ8PVBPCQ7EA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Client = _t, Stage = _t, Fee = _t, Country = _t, url = _t, Source = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Client", type text}, {"Stage", type text}, {"Fee", Int64.Type}, {"Country", type text}, {"url", type text}, {"Source", type text}}),
    RemoveDuplicate = Table.SelectRows(#"Changed Type",each let s=[Source] ,i = [ID] ,  totalSameIDNumber = Table.RowCount(Table.SelectRows(#"Changed Type",each [ID]=i)) in totalSameIDNumber=1 or s="pipeline")
in
    RemoveDuplicate

 


If it doesn't meet your requirement, Please show the exact expected result based on the Tables that you have shared.


Best regards,

Community Support Team _ Dong Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

2 REPLIES 2
v-lid-msft
Community Support
Community Support

Hi @Anonymous ,

 

We can add a step using Table.SelectRows function, please change #"Changed Type" to the name of your previous step.

 

 

RemoveDuplicate = Table.SelectRows(#"Changed Type",each let s=[Source] ,i = [ID] ,  totalSameIDNumber = Table.RowCount(Table.SelectRows(#"Changed Type",each [ID]=i)) in totalSameIDNumber=1 or s="pipeline")

 

 

9.jpg

 

All the queries are here:

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlLSUXJ0ctYDUuWZeUDS1MAASLoFAYnEpGS95PxcIKsgsyA1JzMvVSlWJ1rJGCgQEREBJBWgGKy6oqICqjozryQxG6LWEMj19fVFUovAIHmo9ci2ww1UQDXKCOIiqIQh2JkKKK5EUoxwY05+cQlcvZ8PVBPCQ7EA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Client = _t, Stage = _t, Fee = _t, Country = _t, url = _t, Source = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Client", type text}, {"Stage", type text}, {"Fee", Int64.Type}, {"Country", type text}, {"url", type text}, {"Source", type text}}),
    RemoveDuplicate = Table.SelectRows(#"Changed Type",each let s=[Source] ,i = [ID] ,  totalSameIDNumber = Table.RowCount(Table.SelectRows(#"Changed Type",each [ID]=i)) in totalSameIDNumber=1 or s="pipeline")
in
    RemoveDuplicate

 


If it doesn't meet your requirement, Please show the exact expected result based on the Tables that you have shared.


Best regards,

Community Support Team _ Dong Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Wow... this works really well!!! thank you so much @v-lid-msft !!.

I kinda found out the solution my own way by grouping the table then merges it, but yours works way much cleaner!!! 

 

Thanks a lot!!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors