Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi,
In the power query, I want to remove the dublicates if the ID number is the same. If the ID is repeated, it should keep the Classification = B. For example, the original Table is:
IDNameDateStartTypeDaysQtyClassificationDateClose
4340058 | P1 | 11.15.2022 | M2 | 23 | 1 | C | 12.07.2022 |
4340052 | P1 | 11.14.2022 | M1 | 12 | 1 | C | 11.29.2022 |
4340051 | P1 | 11.14.2022 | M2 | 1 | 1 | A | |
4340050 | P1 | 10.13.2022 | M4 | 31 | 1 | A | |
4340038 | P2 | 11.15.2022 | M3 | 11 | 1 | C | 11.26.2022 |
4340022 | P2 | 11.14.2022 | M7 | 2 | 1 | C | 11.16.2022 |
4340021 | P2 | 10.14.2022 | M2 | 45 | 1 | B | |
4340030 | P2 | 11.13.2022 | M4 | 31 | 1 | C | 12.01.2022 |
4340021 | P2 | 10.14.2022 | M2 | 45 | 1 | A | |
4340050 | P1 | 10.13.2022 | M4 | 31 | 1 | B |
But the red rows since the ID is repeated. I want to keep the classification B if the ID is repeated. In this case, I want to change it to:
IDNameDateStartTypeDaysQtyClassificationDateClose
4340058 | P1 | 11.15.2022 | M2 | 23 | 1 | C | 12.07.2022 |
4340052 | P1 | 11.14.2022 | M1 | 12 | 1 | C | 11.29.2022 |
4340051 | P1 | 11.14.2022 | M2 | 1 | 1 | A | |
4340038 | P2 | 11.15.2022 | M3 | 11 | 1 | C | 11.26.2022 |
4340022 | P2 | 11.14.2022 | M7 | 2 | 1 | C | 11.16.2022 |
4340021 | P2 | 10.14.2022 | M2 | 45 | 1 | B | |
4340030 | P2 | 11.13.2022 | M4 | 31 | 1 | C | 12.01.2022 |
4340050 | P1 | 10.13.2022 | M4 | 31 | 1 | B |
Can you help me?
Best
Solved! Go to Solution.
Hi @IF ,
I apologise! There was a typo in the M code...
I updated the solution above. Feel free to try it once more 🙂
/Tom
https://www.tackytech.blog/
https://www.instagram.com/tackytechtom/
Did I answer your question❓➡️ Please, mark my post as a solution ✔️ |
Also happily accepting Kudos 🙂 |
Feel free to connect with me on LinkedIn! |
#proudtobeasuperuser |
Hi @IF ,
How about this:
Before:
After:
Here the code in Power Query M that you can paste into the advanced editor (if you do not know, how to exactly do this, please check out this quick walkthrough)
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZExDsIwDEWvgjJXlr+dUBiBuVL3qve/BnFoTEjp0sGO7OTJ/zvLEqJG5nQLQ5iRE0BIJCySi8mSqLVzvOwU4vFzvQ4VlhaODpeOtDBI7j2M/3DlLB45Lg3CjjBBHYk56QGjxaDsDBZv6DReO43lqcNfjaNt55fFjoWz3PuLaYOfnVhu5x0YrL+BMwPPbHQTub4B", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Name = _t, DateStart = _t, Type = _t, Days = _t, Qty = _t, Classification = _t, DateClose = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Name", type text}, {"DateStart", type text}, {"Type", type text}, {"Days", Int64.Type}, {"Qty", Int64.Type}, {"Classification", type text}, {"DateClose", type text}}), #"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"AllRows", each _, type table [ID=nullable number, Name=nullable text, DateStart=nullable text, Type=nullable text, Days=nullable number, Qty=nullable number, Classification=nullable text, DateClose=nullable text]}, {"CountRows", each Table.RowCount(_), Int64.Type}}), #"Filtered Rows 1" = Table.SelectRows(#"Grouped Rows", each ([CountRows] > 1)), #"Expanded AllRows 1" = Table.ExpandTableColumn(#"Filtered Rows 1", "AllRows", {"Name", "DateStart", "Type", "Days", "Qty", "Classification", "DateClose"}, {"AllRows.Name", "AllRows.DateStart", "AllRows.Type", "AllRows.Days", "AllRows.Qty", "AllRows.Classification", "AllRows.DateClose"}), #"Filtered Rows 2" = Table.SelectRows(#"Expanded AllRows 1", each ([AllRows.Classification] = "B")), #"Filtered Rows 3" = Table.SelectRows(#"Grouped Rows", each ([CountRows] = 1)), #"Expanded AllRows 2" = Table.ExpandTableColumn(#"Filtered Rows 3", "AllRows", {"Name", "DateStart", "Type", "Days", "Qty", "Classification", "DateClose"}, {"AllRows.Name", "AllRows.DateStart", "AllRows.Type", "AllRows.Days", "AllRows.Qty", "AllRows.Classification", "AllRows.DateClose"}), #"Appended Query" = Table.Combine({#"Expanded AllRows 2", #"Filtered Rows 2"}), #"Renamed Columns" = Table.RenameColumns(#"Appended Query",{{"AllRows.Name", "Name"}, {"AllRows.DateStart", "DateStart"}, {"AllRows.Type", "Type"}, {"AllRows.Days", "Days"}, {"AllRows.Qty", "Qty"}, {"AllRows.Classification", "Classification"}, {"AllRows.DateClose", "DateClose"}}), #"Removed Columns" = Table.RemoveColumns(#"Renamed Columns",{"CountRows"}) in #"Removed Columns"
The idea is to do a group by and filter for the IDs that have more than one row. For these rows, we filter for classification = B. Then we append that resultset with all rows that just have one row in the group by. In the end it's a bit of cleaning like renaming columns and removing the count (support) column.
I'd suggest to paste in the M code in a new query and check out the steps on the right hand side.
Let me know if you have questions 🙂
/Tom
https://www.tackytech.blog/
https://www.instagram.com/tackytechtom/
Did I answer your question❓➡️ Please, mark my post as a solution ✔️ |
Also happily accepting Kudos 🙂 |
Feel free to connect with me on LinkedIn! |
#proudtobeasuperuser |
Hello,
Thanks for the help. I have a question about the start date. I didn't understand why the result is missing the start date data in some cases.
Hi @IF ,
I apologise! There was a typo in the M code...
I updated the solution above. Feel free to try it once more 🙂
/Tom
https://www.tackytech.blog/
https://www.instagram.com/tackytechtom/
Did I answer your question❓➡️ Please, mark my post as a solution ✔️ |
Also happily accepting Kudos 🙂 |
Feel free to connect with me on LinkedIn! |
#proudtobeasuperuser |
Perfect!
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.