Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
IF
Post Prodigy
Post Prodigy

remove duplicates based on a filter

Hi,

In the power query, I want to remove the dublicates if the ID number is the same. If the ID is repeated, it should keep the Classification = B. For example, the original Table is:

IDNameDateStartTypeDaysQtyClassificationDateClose

4340058P111.15.2022M2231C12.07.2022
4340052P111.14.2022M1121C11.29.2022
4340051P111.14.2022M211A 
4340050P110.13.2022M4311A 
4340038P211.15.2022M3111C11.26.2022
4340022P211.14.2022M721C11.16.2022
4340021P210.14.2022M2451B 
4340030P211.13.2022M4311C12.01.2022
4340021P210.14.2022M2451A 
4340050P110.13.2022M4311B 

 

But the red rows since the ID is repeated. I want to keep the classification B if the ID is repeated. In this case,  I want to change it to:

IDNameDateStartTypeDaysQtyClassificationDateClose

4340058P111.15.2022M2231C12.07.2022
4340052P111.14.2022M1121C11.29.2022
4340051P111.14.2022M211A 
4340038P211.15.2022M3111C11.26.2022
4340022P211.14.2022M721C11.16.2022
4340021P210.14.2022M2451B 
4340030P211.13.2022M4311C12.01.2022
4340050P110.13.2022M4311B 

 

Can you help me?

Best

 

1 ACCEPTED SOLUTION

Hi @IF ,

 

I apologise! There was a typo in the M code...

 

I updated the solution above. Feel free to try it once more 🙂

 

/Tom
https://www.tackytech.blog/
https://www.instagram.com/tackytechtom/



Did I answer your question➡️ Please, mark my post as a solution ✔️

Also happily accepting Kudos 🙂

Feel free to connect with me on LinkedIn! linkedIn

#proudtobeasuperuser 

View solution in original post

4 REPLIES 4
tackytechtom
Super User
Super User

Hi @IF ,

 

How about this:

 

Before:

tackytechtom_1-1670439583750.png

 

After:

tackytechtom_0-1670869833282.png

 

 

Here the code in Power Query M that you can paste into the advanced editor (if you do not know, how to exactly do this, please check out this quick walkthrough)

tackytechtom_2-1670439768187.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZExDsIwDEWvgjJXlr+dUBiBuVL3qve/BnFoTEjp0sGO7OTJ/zvLEqJG5nQLQ5iRE0BIJCySi8mSqLVzvOwU4vFzvQ4VlhaODpeOtDBI7j2M/3DlLB45Lg3CjjBBHYk56QGjxaDsDBZv6DReO43lqcNfjaNt55fFjoWz3PuLaYOfnVhu5x0YrL+BMwPPbHQTub4B", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Name = _t, DateStart = _t, Type = _t, Days = _t, Qty = _t, Classification = _t, DateClose = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Name", type text}, {"DateStart", type text}, {"Type", type text}, {"Days", Int64.Type}, {"Qty", Int64.Type}, {"Classification", type text}, {"DateClose", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"AllRows", each _, type table [ID=nullable number, Name=nullable text, DateStart=nullable text, Type=nullable text, Days=nullable number, Qty=nullable number, Classification=nullable text, DateClose=nullable text]}, {"CountRows", each Table.RowCount(_), Int64.Type}}),
    #"Filtered Rows 1" = Table.SelectRows(#"Grouped Rows", each ([CountRows] > 1)),
    #"Expanded AllRows 1" = Table.ExpandTableColumn(#"Filtered Rows 1", "AllRows", {"Name", "DateStart", "Type", "Days", "Qty", "Classification", "DateClose"}, {"AllRows.Name", "AllRows.DateStart", "AllRows.Type", "AllRows.Days", "AllRows.Qty", "AllRows.Classification", "AllRows.DateClose"}),
    #"Filtered Rows 2" = Table.SelectRows(#"Expanded AllRows 1", each ([AllRows.Classification] = "B")),
    #"Filtered Rows 3" = Table.SelectRows(#"Grouped Rows", each ([CountRows] = 1)),
    #"Expanded AllRows 2" = Table.ExpandTableColumn(#"Filtered Rows 3", "AllRows", {"Name", "DateStart", "Type", "Days", "Qty", "Classification", "DateClose"}, {"AllRows.Name", "AllRows.DateStart", "AllRows.Type", "AllRows.Days", "AllRows.Qty", "AllRows.Classification", "AllRows.DateClose"}),
    #"Appended Query" = Table.Combine({#"Expanded AllRows 2", #"Filtered Rows 2"}),
    #"Renamed Columns" = Table.RenameColumns(#"Appended Query",{{"AllRows.Name", "Name"}, {"AllRows.DateStart", "DateStart"}, {"AllRows.Type", "Type"}, {"AllRows.Days", "Days"}, {"AllRows.Qty", "Qty"}, {"AllRows.Classification", "Classification"}, {"AllRows.DateClose", "DateClose"}}),
    #"Removed Columns" = Table.RemoveColumns(#"Renamed Columns",{"CountRows"})
in
    #"Removed Columns"

 

The idea is to do a group by and filter for the IDs that have more than one row. For these rows, we filter for classification = B. Then we append that resultset with all rows that just have one row in the group by. In the end it's a bit of cleaning like renaming columns and removing the count (support) column.

 

I'd suggest to paste in the M code in a new query and check out the steps on the right hand side.

 

Let me know if you have questions 🙂

 

/Tom
https://www.tackytech.blog/
https://www.instagram.com/tackytechtom/



Did I answer your question➡️ Please, mark my post as a solution ✔️

Also happily accepting Kudos 🙂

Feel free to connect with me on LinkedIn! linkedIn

#proudtobeasuperuser 

Hello,

Thanks for the help. I have a question about the start date. I didn't understand why the result is missing the start date data in some cases.

 

Hi @IF ,

 

I apologise! There was a typo in the M code...

 

I updated the solution above. Feel free to try it once more 🙂

 

/Tom
https://www.tackytech.blog/
https://www.instagram.com/tackytechtom/



Did I answer your question➡️ Please, mark my post as a solution ✔️

Also happily accepting Kudos 🙂

Feel free to connect with me on LinkedIn! linkedIn

#proudtobeasuperuser 

Perfect!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors