cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
DSwezey
Helper II
Helper II

Remove duplicates based on Multiple Columns

Hi there,

 

I have a table with 3 columns in it. Staff, Staff Position, Client.

Some staff members can have multiple positions with the same client. Some clients can have multiple staff members assigned to it.

See example below:

DSwezey_0-1641839435221.png

I want to filter out Manager positions where there are duplicate Staff & Client. Example above would be Bob. He is both a manager and partner for Google so I want to keep just 1 record of this and delete the manager row. 

 

Not sure how to go about this.

DSwezey_1-1641839514589.png

 

1 ACCEPTED SOLUTION
v-jingzhang
Community Support
Community Support

Hi @DSwezey 

 

You could try this method

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcspPUtJR8k3MS0xPLQKy3PPz03NSlWJ1YFIBiUUleehSzhlFmcUo+hxzE6vy88CSjjmpFShyPvnlqcVgqaDKxDwUM6FSsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Staff = _t, #"Staff Position" = _t, Client = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Staff", type text}, {"Staff Position", type text}, {"Client", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Staff", "Client"}, {{"Count", each Table.RowCount(_), Int64.Type}, {"AllData", each _, type table [Staff=nullable text, Staff Position=nullable text, Client=nullable text]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each if [Count] > 1 then Table.SelectRows([AllData], each [Staff Position] <> "Manager") else [AllData]),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"Staff", "Staff Position", "Client"}, {"Staff", "Staff Position", "Client"})
in
    #"Expanded Custom"

vjingzhang_0-1642068371606.png

 

Best Regards,
Community Support Team _ Jing
If this post helps, please Accept it as Solution to help other members find it.

View solution in original post

4 REPLIES 4
v-jingzhang
Community Support
Community Support

Hi @DSwezey 

 

You could try this method

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcspPUtJR8k3MS0xPLQKy3PPz03NSlWJ1YFIBiUUleehSzhlFmcUo+hxzE6vy88CSjjmpFShyPvnlqcVgqaDKxDwUM6FSsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Staff = _t, #"Staff Position" = _t, Client = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Staff", type text}, {"Staff Position", type text}, {"Client", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Staff", "Client"}, {{"Count", each Table.RowCount(_), Int64.Type}, {"AllData", each _, type table [Staff=nullable text, Staff Position=nullable text, Client=nullable text]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each if [Count] > 1 then Table.SelectRows([AllData], each [Staff Position] <> "Manager") else [AllData]),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"Staff", "Staff Position", "Client"}, {"Staff", "Staff Position", "Client"})
in
    #"Expanded Custom"

vjingzhang_0-1642068371606.png

 

Best Regards,
Community Support Team _ Jing
If this post helps, please Accept it as Solution to help other members find it.

ronrsnfld
Solution Sage
Solution Sage

To remove the duplicates, just select the columns you want to consider, and remove rows=>remove duplicates.  This method will retain the first in the list.

 

If you want control over which gets deleted, first sort your list appropriately so that the priority is at the top.

 

In your simplified example it would be a Z=>A sort of the staff position column, but I suspect your real data may be more complex.

The issue with that is There can be multiple clients for different Staff members. For example, Lowes is assigned to Alex and Ryan. I do not want to filter out any of that data even though Lowes is a duplicate. I want to filter out duplicate clients for Managers when the Staff is also a duplicate (i.e. Bob)

Did you actually try what I suggested?

It works fine on the data you provided and produces the output you show.

 

Please communicate exactly what you did, so I can assess the problem.

 

Helpful resources

Announcements
Microsoft Build 768x460.png

Microsoft Build is May 24-26. Have you registered yet?

Come together to explore latest innovations in code and application development—and gain insights from experts from around the world.

charticulator_carousel_with_text (1).png

Charticulator Design Challenge

Put your data visualization and design skills to the test! This exciting challenge is happening now through May 31st!

May UG Leader Call Carousel 768x460.png

What difference can a User Group make for you?

At the monthly call, connect with other leaders and find out how community makes your experience even better.

Power BI Release May 2022 768x460.png

Check it out!

Click here to read more about the May 2022 updates!