Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi all
I created a column called "duplicates" which is a concat of 4 columns. My intention is to have removed those duplicates (more or equal to 2) and leave only those rows whose the "date created column" is the most recent one.
For example:
The above image, has 2 rows (duplicated) so it should leave only the one with the created = most recent and remove the oldest one.
Is that possible with Power Query?
I attach the pbix
https://1drv.ms/u/s!ApgeWwGTKtFdhy__p6oEYLV0NrTY?e=gebU8R
Thanks all!
Solved! Go to Solution.
You can download the file: HERE
________________________
If my answer was helpful, please consider Accept it as the solution to help the other members find it
Click on the Thumbs-Up icon if you like this reply 🙂
⭕ Subscribe and learn Power BI from these videos
⚪ Website ⚪ LinkedIn ⚪ PBI User Group
@o59393
Paste the below code in Blank Query and check the steps: (Change the file location)
let
Source = Excel.Workbook(File.Contents("C:\Users\Fowmy\OneDrive\BI\PBICommunity\Duplicates removed.xlsx"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Duplicates", Int64.Type}, {"Date created", type datetime}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Duplicates"}, {{"Count", each List.Max([Date created]), type nullable datetime}, {"all", each _, type table [Facility ID=number, #"Especificación / Specification"=text, Year=number, #"ES-RQ-225 Parámetro / Parameter"=text, Q1=date, #"Resultado / Result (Q1)"=number, #"Cumple? / Complies?"=text, Q2=date, #"Resultado / Result (Q2)"=number, #"Cumple? / Complies?_1"=text, Q3=date, #"Resultado / Result (Q3)"=any, #"Cumple? / Complies?_2"=text, Q4=date, #"Resultado / Result (Q4)"=any, #"Cumple? / Complies?_3"=text, #"Especificación / Specification_4"=text, Q1_5=date, #"Resultado / Result (Q1)_6"=number, #"Cumple? / Complies?_7"=text, Q2_8=date, #"Resultado / Result (Q2)_9"=number, #"Cumple? / Complies?_10"=text, Q3_11=date, #"Resultado / Result (Q3)_12"=number, #"Cumple? / Complies?_13"=text, Q4_14=date, #"Resultado / Result (Q4)_15"=number, #"Cumple? / Complies?_16"=text, Duplicates=nullable number, Date modified=datetime, Date created=nullable datetime]}}),
#"Expanded all" = Table.ExpandTableColumn(#"Grouped Rows", "all", {"Facility ID", "Especificación / Specification", "Year", "ES-RQ-225 Parámetro / Parameter", "Q1", "Resultado / Result (Q1)", "Cumple? / Complies?", "Q2", "Resultado / Result (Q2)", "Cumple? / Complies?_1", "Q3", "Resultado / Result (Q3)", "Cumple? / Complies?_2", "Q4", "Resultado / Result (Q4)", "Cumple? / Complies?_3", "Especificación / Specification_4", "Q1_5", "Resultado / Result (Q1)_6", "Cumple? / Complies?_7", "Q2_8", "Resultado / Result (Q2)_9", "Cumple? / Complies?_10", "Q3_11", "Resultado / Result (Q3)_12", "Cumple? / Complies?_13", "Q4_14", "Resultado / Result (Q4)_15", "Cumple? / Complies?_16", "Duplicates", "Date modified", "Date created"}, {"Facility ID", "Especificación / Specification", "Year", "ES-RQ-225 Parámetro / Parameter", "Q1", "Resultado / Result (Q1)", "Cumple? / Complies?", "Q2", "Resultado / Result (Q2)", "Cumple? / Complies?_1", "Q3", "Resultado / Result (Q3)", "Cumple? / Complies?_2", "Q4", "Resultado / Result (Q4)", "Cumple? / Complies?_3", "Especificación / Specification_4", "Q1_5", "Resultado / Result (Q1)_6", "Cumple? / Complies?_7", "Q2_8", "Resultado / Result (Q2)_9", "Cumple? / Complies?_10", "Q3_11", "Resultado / Result (Q3)_12", "Cumple? / Complies?_13", "Q4_14", "Resultado / Result (Q4)_15", "Cumple? / Complies?_16", "Duplicates.1", "Date modified", "Date created"}),
#"Added Custom" = Table.AddColumn(#"Expanded all", "Custom", each if [Duplicates.1] = [Duplicates] and [Date created] = [Count] then "Keep" else "Remove"),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = "Keep"))
in
#"Filtered Rows"
________________________
If my answer was helpful, please consider Accept it as the solution to help the other members find it
Click on the Thumbs-Up icon if you like this reply 🙂
⭕ Subscribe and learn Power BI from these videos
⚪ Website ⚪ LinkedIn ⚪ PBI User Group
You can download the file: HERE
________________________
If my answer was helpful, please consider Accept it as the solution to help the other members find it
Click on the Thumbs-Up icon if you like this reply 🙂
⭕ Subscribe and learn Power BI from these videos
⚪ Website ⚪ LinkedIn ⚪ PBI User Group
Solved @Fowmy
Did a small change to your step to get the latest date per each concat row:
Thanks a mill!
Hi @Fowmy
I tested the solution but I detected 2 things:
Can you please filter in the data tab for example facility id "4917" and the parameter "pH", it showed me 2 rows but should be only one row:
The other thing is that if I filter the facility id column, I only see a few of them (some are gone):
For example the original file has the facility id "6423" but after all the steps in Power Query it doesnt appear.
Can you help me?
Thanks.
Hi @o59393 ,
Would you please refer to :
let
Source = Excel.Workbook(File.Contents("C:\Users\dedmond\Downloads\Duplicates removed.xlsx"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Facility ID", Int64.Type}, {"Especificación / Specification", type text}, {"Year", Int64.Type}, {"ES-RQ-225 Parámetro / Parameter", type text}, {"Q1", type date}, {"Resultado / Result (Q1)", Int64.Type}, {"Cumple? / Complies?", type text}, {"Q2", type date}, {"Resultado / Result (Q2)", Int64.Type}, {"Cumple? / Complies?_1", type text}, {"Q3", type date}, {"Resultado / Result (Q3)", Int64.Type}, {"Cumple? / Complies?_2", type text}, {"Q4", type date}, {"Resultado / Result (Q4)", Int64.Type}, {"Cumple? / Complies?_3", type text}, {"Especificación / Specification_4", Int64.Type}, {"Q1_5", type date}, {"Resultado / Result (Q1)_6", Int64.Type}, {"Cumple? / Complies?_7", type text}, {"Q2_8", type date}, {"Resultado / Result (Q2)_9", Int64.Type}, {"Cumple? / Complies?_10", type text}, {"Q3_11", type date}, {"Resultado / Result (Q3)_12", Int64.Type}, {"Cumple? / Complies?_13", type text}, {"Q4_14", type date}, {"Resultado / Result (Q4)_15", Int64.Type}, {"Cumple? / Complies?_16", type text}, {"Duplicates", Int64.Type}, {"Date modified", type datetime}, {"Date created", type datetime}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Duplicates", "Facility ID"}, {{"AllRows", each _, type table [Facility ID=nullable number, #"Especificación / Specification"=nullable text, Year=nullable number, #"ES-RQ-225 Parámetro / Parameter"=nullable text, Q1=nullable date, #"Resultado / Result (Q1)"=nullable number, #"Cumple? / Complies?"=nullable text, Q2=nullable date, #"Resultado / Result (Q2)"=nullable number, #"Cumple? / Complies?_1"=nullable text, Q3=nullable date, #"Resultado / Result (Q3)"=nullable number, #"Cumple? / Complies?_2"=nullable text, Q4=nullable date, #"Resultado / Result (Q4)"=nullable number, #"Cumple? / Complies?_3"=nullable text, #"Especificación / Specification_4"=nullable number, Q1_5=nullable date, #"Resultado / Result (Q1)_6"=nullable number, #"Cumple? / Complies?_7"=nullable text, Q2_8=nullable date, #"Resultado / Result (Q2)_9"=nullable number, #"Cumple? / Complies?_10"=nullable text, Q3_11=nullable date, #"Resultado / Result (Q3)_12"=nullable number, #"Cumple? / Complies?_13"=nullable text, Q4_14=nullable date, #"Resultado / Result (Q4)_15"=nullable number, #"Cumple? / Complies?_16"=nullable text, Duplicates=nullable number, Date modified=nullable datetime, Date created=nullable datetime]}}),
RankFunction = (tabletorank as table) as table =>
let
SortRows = Table.Sort(tabletorank,{{"Date created", Order.Descending}}),
AddIndex = Table.AddIndexColumn(SortRows, "Rank", 1, 1)
in
AddIndex,
AddedRank = Table.TransformColumns(#"Grouped Rows", {"AllRows", each RankFunction(_)}),
#"Expanded AllRows" = Table.ExpandTableColumn(AddedRank, "AllRows", {"Facility ID", "Especificación / Specification", "Year", "ES-RQ-225 Parámetro / Parameter", "Q1", "Resultado / Result (Q1)", "Cumple? / Complies?", "Q2", "Resultado / Result (Q2)", "Cumple? / Complies?_1", "Q3", "Resultado / Result (Q3)", "Cumple? / Complies?_2", "Q4", "Resultado / Result (Q4)", "Cumple? / Complies?_3", "Especificación / Specification_4", "Q1_5", "Resultado / Result (Q1)_6", "Cumple? / Complies?_7", "Q2_8", "Resultado / Result (Q2)_9", "Cumple? / Complies?_10", "Q3_11", "Resultado / Result (Q3)_12", "Cumple? / Complies?_13", "Q4_14", "Resultado / Result (Q4)_15", "Cumple? / Complies?_16", "Duplicates", "Date modified", "Date created", "Rank"}, {"AllRows.Facility ID", "AllRows.Especificación / Specification", "AllRows.Year", "AllRows.ES-RQ-225 Parámetro / Parameter", "AllRows.Q1", "AllRows.Resultado / Result (Q1)", "AllRows.Cumple? / Complies?", "AllRows.Q2", "AllRows.Resultado / Result (Q2)", "AllRows.Cumple? / Complies?_1", "AllRows.Q3", "AllRows.Resultado / Result (Q3)", "AllRows.Cumple? / Complies?_2", "AllRows.Q4", "AllRows.Resultado / Result (Q4)", "AllRows.Cumple? / Complies?_3", "AllRows.Especificación / Specification_4", "AllRows.Q1_5", "AllRows.Resultado / Result (Q1)_6", "AllRows.Cumple? / Complies?_7", "AllRows.Q2_8", "AllRows.Resultado / Result (Q2)_9", "AllRows.Cumple? / Complies?_10", "AllRows.Q3_11", "AllRows.Resultado / Result (Q3)_12", "AllRows.Cumple? / Complies?_13", "AllRows.Q4_14", "AllRows.Resultado / Result (Q4)_15", "AllRows.Cumple? / Complies?_16", "AllRows.Duplicates", "AllRows.Date modified", "AllRows.Date created", "AllRows.Rank"}),
#"Duplicated Column" = Table.AddColumn(#"Expanded AllRows", "AllRows.Rank - Copy", each [AllRows.Rank], type any),
#"Removed Columns" = Table.RemoveColumns(#"Duplicated Column",{"AllRows.Rank - Copy"}),
#"Added Custom" = Table.AddColumn(#"Removed Columns", "Custom", each if[Duplicates]<2 then 1 else [AllRows.Rank]),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = 1)),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"AllRows.Rank", "Custom"}),
#"Filtered Rows1" = Table.SelectRows(#"Removed Columns1", each true),
#"Removed Columns3" = Table.RemoveColumns(#"Filtered Rows1",{"AllRows.Facility ID"}),
#"Removed Columns2" = Table.RemoveColumns(#"Removed Columns3",{"AllRows.Duplicates"}),
#"Removed Duplicates" = Table.Distinct(#"Removed Columns2", {"Facility ID"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Duplicates",{{"AllRows.Especificación / Specification", "Especificación / Specification"}, {"AllRows.Year", "Year"}, {"AllRows.ES-RQ-225 Parámetro / Parameter", "ES-RQ-225 Parámetro / Parameter"}, {"AllRows.Q1", "Q1"}, {"AllRows.Resultado / Result (Q1)", "Resultado / Result (Q1)"}, {"AllRows.Cumple? / Complies?", "Cumple? / Complies?"}, {"AllRows.Q2", "Q2"}, {"AllRows.Resultado / Result (Q2)", "Resultado / Result (Q2)"}, {"AllRows.Cumple? / Complies?_1", "Cumple? / Complies?_1"}, {"AllRows.Q3", "Q3"}, {"AllRows.Resultado / Result (Q3)", "Resultado / Result (Q3)"}, {"AllRows.Cumple? / Complies?_2", "Cumple? / Complies?_2"}, {"AllRows.Q4", "Q4"}, {"AllRows.Resultado / Result (Q4)", "Resultado / Result (Q4)"}, {"AllRows.Cumple? / Complies?_3", "Cumple? / Complies?_3"}, {"AllRows.Especificación / Specification_4", "Especificación / Specification_4"}, {"AllRows.Q1_5", "Q1_5"}, {"AllRows.Resultado / Result (Q1)_6", "Resultado / Result (Q1)_6"}, {"AllRows.Cumple? / Complies?_7", "Cumple? / Complies?_7"}, {"AllRows.Q2_8", "Q2_8"}, {"AllRows.Resultado / Result (Q2)_9", "Resultado / Result (Q2)_9"}, {"AllRows.Cumple? / Complies?_10", "Cumple? / Complies?_10"}, {"AllRows.Q3_11", "Q3_11"}, {"AllRows.Resultado / Result (Q3)_12", "Resultado / Result (Q3)_12"}, {"AllRows.Cumple? / Complies?_13", "Cumple? / Complies?_13"}, {"AllRows.Q4_14", "Q4_14"}, {"AllRows.Resultado / Result (Q4)_15", "Resultado / Result (Q4)_15"}, {"AllRows.Cumple? / Complies?_16", "Cumple? / Complies?_16"}, {"AllRows.Date modified", "Date modified"}, {"AllRows.Date created", "Date created"}})
in
#"Renamed Columns"
For more details, please refer to the pbix file: https://qiuyunus-my.sharepoint.com/:u:/g/personal/pbipro_qiuyunus_onmicrosoft_com/EU5t7uwEdedFjUeLbG...
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Best Regards,
Dedmon Dai
@o59393
Can you share a sample of your data as csv or excel, the PBIX needs credentials.
________________________
If my answer was helpful, please consider Accept it as the solution to help the other members find it
Click on the Thumbs-Up icon if you like this reply 🙂
⭕ Subscribe and learn Power BI from these videos
⚪ Website ⚪ LinkedIn ⚪ PBI User Group
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
106 | |
94 | |
77 | |
65 | |
53 |
User | Count |
---|---|
145 | |
105 | |
104 | |
90 | |
63 |