Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
o59393
Post Prodigy
Post Prodigy

How to leave the most recent date on duplicate rows with Power Query

Hi all

 

I created a column called "duplicates" which is a concat of 4 columns. My intention is to have removed those duplicates (more or equal to 2) and leave only those rows whose the "date created column" is the most recent one.

 

For example:

 

dfsddsf3.JPG

 

The above image, has 2 rows (duplicated) so it should leave only the one with the created = most recent and remove the oldest one.

 

Is that possible with Power Query?

 

I attach the pbix

 

https://1drv.ms/u/s!ApgeWwGTKtFdhy__p6oEYLV0NrTY?e=gebU8R

 

Thanks all!

1 ACCEPTED SOLUTION

@o59393 

You can download the file: HERE

________________________

If my answer was helpful, please consider Accept it as the solution to help the other members find it

Click on the Thumbs-Up icon if you like this reply 🙂

YouTube  LinkedIn

Did I answer your question? Mark my post as a solution! and hit thumbs up


Subscribe and learn Power BI from these videos

Website LinkedIn PBI User Group

View solution in original post

9 REPLIES 9
Fowmy
Super User
Super User

@o59393 

Paste the below code in Blank Query and check the steps: (Change the file location)

let
    Source = Excel.Workbook(File.Contents("C:\Users\Fowmy\OneDrive\BI\PBICommunity\Duplicates removed.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Duplicates", Int64.Type}, {"Date created", type datetime}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Duplicates"}, {{"Count", each List.Max([Date created]), type nullable datetime}, {"all", each _, type table [Facility ID=number, #"Especificación / Specification"=text, Year=number, #"ES-RQ-225 Parámetro / Parameter"=text, Q1=date, #"Resultado / Result (Q1)"=number, #"Cumple? / Complies?"=text, Q2=date, #"Resultado / Result (Q2)"=number, #"Cumple? / Complies?_1"=text, Q3=date, #"Resultado / Result (Q3)"=any, #"Cumple? / Complies?_2"=text, Q4=date, #"Resultado / Result (Q4)"=any, #"Cumple? / Complies?_3"=text, #"Especificación / Specification_4"=text, Q1_5=date, #"Resultado / Result (Q1)_6"=number, #"Cumple? / Complies?_7"=text, Q2_8=date, #"Resultado / Result (Q2)_9"=number, #"Cumple? / Complies?_10"=text, Q3_11=date, #"Resultado / Result (Q3)_12"=number, #"Cumple? / Complies?_13"=text, Q4_14=date, #"Resultado / Result (Q4)_15"=number, #"Cumple? / Complies?_16"=text, Duplicates=nullable number, Date modified=datetime, Date created=nullable datetime]}}),
    #"Expanded all" = Table.ExpandTableColumn(#"Grouped Rows", "all", {"Facility ID", "Especificación / Specification", "Year", "ES-RQ-225 Parámetro / Parameter", "Q1", "Resultado / Result (Q1)", "Cumple? / Complies?", "Q2", "Resultado / Result (Q2)", "Cumple? / Complies?_1", "Q3", "Resultado / Result (Q3)", "Cumple? / Complies?_2", "Q4", "Resultado / Result (Q4)", "Cumple? / Complies?_3", "Especificación / Specification_4", "Q1_5", "Resultado / Result (Q1)_6", "Cumple? / Complies?_7", "Q2_8", "Resultado / Result (Q2)_9", "Cumple? / Complies?_10", "Q3_11", "Resultado / Result (Q3)_12", "Cumple? / Complies?_13", "Q4_14", "Resultado / Result (Q4)_15", "Cumple? / Complies?_16", "Duplicates", "Date modified", "Date created"}, {"Facility ID", "Especificación / Specification", "Year", "ES-RQ-225 Parámetro / Parameter", "Q1", "Resultado / Result (Q1)", "Cumple? / Complies?", "Q2", "Resultado / Result (Q2)", "Cumple? / Complies?_1", "Q3", "Resultado / Result (Q3)", "Cumple? / Complies?_2", "Q4", "Resultado / Result (Q4)", "Cumple? / Complies?_3", "Especificación / Specification_4", "Q1_5", "Resultado / Result (Q1)_6", "Cumple? / Complies?_7", "Q2_8", "Resultado / Result (Q2)_9", "Cumple? / Complies?_10", "Q3_11", "Resultado / Result (Q3)_12", "Cumple? / Complies?_13", "Q4_14", "Resultado / Result (Q4)_15", "Cumple? / Complies?_16", "Duplicates.1", "Date modified", "Date created"}),
    #"Added Custom" = Table.AddColumn(#"Expanded all", "Custom", each if [Duplicates.1] = [Duplicates] and [Date created] = [Count] then "Keep" else "Remove"),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = "Keep"))
in
    #"Filtered Rows"

Fowmy_0-1600716301322.png

________________________

If my answer was helpful, please consider Accept it as the solution to help the other members find it

Click on the Thumbs-Up icon if you like this reply 🙂

YouTube  LinkedIn

 



 

Did I answer your question? Mark my post as a solution! and hit thumbs up


Subscribe and learn Power BI from these videos

Website LinkedIn PBI User Group

Hi @Fowmy 


Can you please share the pbix for me to see the applied steps you did?

 

Thanks a lot 🙂

@o59393 

You can download the file: HERE

________________________

If my answer was helpful, please consider Accept it as the solution to help the other members find it

Click on the Thumbs-Up icon if you like this reply 🙂

YouTube  LinkedIn

Did I answer your question? Mark my post as a solution! and hit thumbs up


Subscribe and learn Power BI from these videos

Website LinkedIn PBI User Group

Solved @Fowmy 

 

Did a small change to your step to get the latest date per each concat row:

 

45lklk.JPG

 

Thanks a mill!

Hi @Fowmy 

 

I tested the solution but I detected 2 things:


Can you please filter in the data tab for example facility id "4917" and the parameter "pH", it showed me 2 rows but should be only one row:

 

retert44.JPG

 

The other thing is that if I filter the facility id column, I only see a few of them (some are gone):

 

For example the original file has the facility id "6423" but after all the steps in Power Query it doesnt appear.

 

Can you help me?

 

Thanks. 

Hi @o59393 ,

 

Would you please refer to :

 

 

let
    Source = Excel.Workbook(File.Contents("C:\Users\dedmond\Downloads\Duplicates removed.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Facility ID", Int64.Type}, {"Especificación / Specification", type text}, {"Year", Int64.Type}, {"ES-RQ-225 Parámetro / Parameter", type text}, {"Q1", type date}, {"Resultado / Result (Q1)", Int64.Type}, {"Cumple? / Complies?", type text}, {"Q2", type date}, {"Resultado / Result (Q2)", Int64.Type}, {"Cumple? / Complies?_1", type text}, {"Q3", type date}, {"Resultado / Result (Q3)", Int64.Type}, {"Cumple? / Complies?_2", type text}, {"Q4", type date}, {"Resultado / Result (Q4)", Int64.Type}, {"Cumple? / Complies?_3", type text}, {"Especificación / Specification_4", Int64.Type}, {"Q1_5", type date}, {"Resultado / Result (Q1)_6", Int64.Type}, {"Cumple? / Complies?_7", type text}, {"Q2_8", type date}, {"Resultado / Result (Q2)_9", Int64.Type}, {"Cumple? / Complies?_10", type text}, {"Q3_11", type date}, {"Resultado / Result (Q3)_12", Int64.Type}, {"Cumple? / Complies?_13", type text}, {"Q4_14", type date}, {"Resultado / Result (Q4)_15", Int64.Type}, {"Cumple? / Complies?_16", type text}, {"Duplicates", Int64.Type}, {"Date modified", type datetime}, {"Date created", type datetime}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Duplicates", "Facility ID"}, {{"AllRows", each _, type table [Facility ID=nullable number, #"Especificación / Specification"=nullable text, Year=nullable number, #"ES-RQ-225 Parámetro / Parameter"=nullable text, Q1=nullable date, #"Resultado / Result (Q1)"=nullable number, #"Cumple? / Complies?"=nullable text, Q2=nullable date, #"Resultado / Result (Q2)"=nullable number, #"Cumple? / Complies?_1"=nullable text, Q3=nullable date, #"Resultado / Result (Q3)"=nullable number, #"Cumple? / Complies?_2"=nullable text, Q4=nullable date, #"Resultado / Result (Q4)"=nullable number, #"Cumple? / Complies?_3"=nullable text, #"Especificación / Specification_4"=nullable number, Q1_5=nullable date, #"Resultado / Result (Q1)_6"=nullable number, #"Cumple? / Complies?_7"=nullable text, Q2_8=nullable date, #"Resultado / Result (Q2)_9"=nullable number, #"Cumple? / Complies?_10"=nullable text, Q3_11=nullable date, #"Resultado / Result (Q3)_12"=nullable number, #"Cumple? / Complies?_13"=nullable text, Q4_14=nullable date, #"Resultado / Result (Q4)_15"=nullable number, #"Cumple? / Complies?_16"=nullable text, Duplicates=nullable number, Date modified=nullable datetime, Date created=nullable datetime]}}),
    RankFunction = (tabletorank as table) as table =>
     let
      SortRows = Table.Sort(tabletorank,{{"Date created", Order.Descending}}),
      AddIndex = Table.AddIndexColumn(SortRows, "Rank", 1, 1)
     in
      AddIndex,
    AddedRank = Table.TransformColumns(#"Grouped Rows", {"AllRows", each RankFunction(_)}),
    #"Expanded AllRows" = Table.ExpandTableColumn(AddedRank, "AllRows", {"Facility ID", "Especificación / Specification", "Year", "ES-RQ-225 Parámetro / Parameter", "Q1", "Resultado / Result (Q1)", "Cumple? / Complies?", "Q2", "Resultado / Result (Q2)", "Cumple? / Complies?_1", "Q3", "Resultado / Result (Q3)", "Cumple? / Complies?_2", "Q4", "Resultado / Result (Q4)", "Cumple? / Complies?_3", "Especificación / Specification_4", "Q1_5", "Resultado / Result (Q1)_6", "Cumple? / Complies?_7", "Q2_8", "Resultado / Result (Q2)_9", "Cumple? / Complies?_10", "Q3_11", "Resultado / Result (Q3)_12", "Cumple? / Complies?_13", "Q4_14", "Resultado / Result (Q4)_15", "Cumple? / Complies?_16", "Duplicates", "Date modified", "Date created", "Rank"}, {"AllRows.Facility ID", "AllRows.Especificación / Specification", "AllRows.Year", "AllRows.ES-RQ-225 Parámetro / Parameter", "AllRows.Q1", "AllRows.Resultado / Result (Q1)", "AllRows.Cumple? / Complies?", "AllRows.Q2", "AllRows.Resultado / Result (Q2)", "AllRows.Cumple? / Complies?_1", "AllRows.Q3", "AllRows.Resultado / Result (Q3)", "AllRows.Cumple? / Complies?_2", "AllRows.Q4", "AllRows.Resultado / Result (Q4)", "AllRows.Cumple? / Complies?_3", "AllRows.Especificación / Specification_4", "AllRows.Q1_5", "AllRows.Resultado / Result (Q1)_6", "AllRows.Cumple? / Complies?_7", "AllRows.Q2_8", "AllRows.Resultado / Result (Q2)_9", "AllRows.Cumple? / Complies?_10", "AllRows.Q3_11", "AllRows.Resultado / Result (Q3)_12", "AllRows.Cumple? / Complies?_13", "AllRows.Q4_14", "AllRows.Resultado / Result (Q4)_15", "AllRows.Cumple? / Complies?_16", "AllRows.Duplicates", "AllRows.Date modified", "AllRows.Date created", "AllRows.Rank"}),
    #"Duplicated Column" = Table.AddColumn(#"Expanded AllRows", "AllRows.Rank - Copy", each [AllRows.Rank], type any),
    #"Removed Columns" = Table.RemoveColumns(#"Duplicated Column",{"AllRows.Rank - Copy"}),
    #"Added Custom" = Table.AddColumn(#"Removed Columns", "Custom", each if[Duplicates]<2 then 1 else [AllRows.Rank]),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = 1)),
    #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"AllRows.Rank", "Custom"}),
    #"Filtered Rows1" = Table.SelectRows(#"Removed Columns1", each true),
    #"Removed Columns3" = Table.RemoveColumns(#"Filtered Rows1",{"AllRows.Facility ID"}),
    #"Removed Columns2" = Table.RemoveColumns(#"Removed Columns3",{"AllRows.Duplicates"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns2", {"Facility ID"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Duplicates",{{"AllRows.Especificación / Specification", "Especificación / Specification"}, {"AllRows.Year", "Year"}, {"AllRows.ES-RQ-225 Parámetro / Parameter", "ES-RQ-225 Parámetro / Parameter"}, {"AllRows.Q1", "Q1"}, {"AllRows.Resultado / Result (Q1)", "Resultado / Result (Q1)"}, {"AllRows.Cumple? / Complies?", "Cumple? / Complies?"}, {"AllRows.Q2", "Q2"}, {"AllRows.Resultado / Result (Q2)", "Resultado / Result (Q2)"}, {"AllRows.Cumple? / Complies?_1", "Cumple? / Complies?_1"}, {"AllRows.Q3", "Q3"}, {"AllRows.Resultado / Result (Q3)", "Resultado / Result (Q3)"}, {"AllRows.Cumple? / Complies?_2", "Cumple? / Complies?_2"}, {"AllRows.Q4", "Q4"}, {"AllRows.Resultado / Result (Q4)", "Resultado / Result (Q4)"}, {"AllRows.Cumple? / Complies?_3", "Cumple? / Complies?_3"}, {"AllRows.Especificación / Specification_4", "Especificación / Specification_4"}, {"AllRows.Q1_5", "Q1_5"}, {"AllRows.Resultado / Result (Q1)_6", "Resultado / Result (Q1)_6"}, {"AllRows.Cumple? / Complies?_7", "Cumple? / Complies?_7"}, {"AllRows.Q2_8", "Q2_8"}, {"AllRows.Resultado / Result (Q2)_9", "Resultado / Result (Q2)_9"}, {"AllRows.Cumple? / Complies?_10", "Cumple? / Complies?_10"}, {"AllRows.Q3_11", "Q3_11"}, {"AllRows.Resultado / Result (Q3)_12", "Resultado / Result (Q3)_12"}, {"AllRows.Cumple? / Complies?_13", "Cumple? / Complies?_13"}, {"AllRows.Q4_14", "Q4_14"}, {"AllRows.Resultado / Result (Q4)_15", "Resultado / Result (Q4)_15"}, {"AllRows.Cumple? / Complies?_16", "Cumple? / Complies?_16"}, {"AllRows.Date modified", "Date modified"}, {"AllRows.Date created", "Date created"}})
in
    #"Renamed Columns"

 

 

For more details, please refer to the pbix file: https://qiuyunus-my.sharepoint.com/:u:/g/personal/pbipro_qiuyunus_onmicrosoft_com/EU5t7uwEdedFjUeLbG...

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

Best Regards,

Dedmon Dai

Thank you! @v-deddai1-msft 

Fowmy
Super User
Super User

@o59393 

Can you share a sample of your data as csv or excel, the PBIX needs credentials.

 

________________________

If my answer was helpful, please consider Accept it as the solution to help the other members find it

Click on the Thumbs-Up icon if you like this reply 🙂

YouTube  LinkedIn

Did I answer your question? Mark my post as a solution! and hit thumbs up


Subscribe and learn Power BI from these videos

Website LinkedIn PBI User Group

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.