Solved: Re: Find the Max Date by Year for Multiple IDs Usi...

MSargeant · ‎01-19-2023

Hi All,

I am looking for a way to filter or trim a data set in Power Query that has dates from multiple years for over 1k unique IDs. I need to keep the rows with the most recent recorded date for each year, for each unique ID.

Thanks!

Below are a data sample and expected result:

Sample
Index	ID	SD	Band	Recorded Date	Year
16	ABD3100170220.AL35	1	Good	20/08/2019	2019
2094	ABD3100170220.AL35	0.8	Good	23/06/2020	2020
3073	ABD3100170220.AL35	0.8	Good	10/11/2020	2020
6126	ABD3100170220.AL35	0.8	Good	20/07/2021	2021
7118	ABD3100170220.AL35	0.8	Good	19/10/2021	2021
7992	ABD3100170220.AL35	null	No Data	20/09/2022	2022
21	ABD31001711001320.AL35	1.2	Good	06/11/2018	2018
2099	ABD31001711001320.AL35	1.2	Good	23/06/2020	2020
3078	ABD31001711001320.AL35	1.2	Good	10/11/2020	2020
6131	ABD31001711001320.AL35	1.2	Good	20/07/2021	2021
7123	ABD31001711001320.AL35	1.1	Good	19/10/2021	2021
7997	ABD31001711001320.AL35	2.9	Satisfactory	20/09/2022	2022
22	ABD31001713201540.AL35	0.9	Good	06/11/2018	2018
2100	ABD31001713201540.AL35	0.9	Good	23/06/2020	2020
3079	ABD31001713201540.AL35	0.9	Good	10/11/2020	2020
6132	ABD31001713201540.AL35	0.8	Good	20/07/2021	2021
7124	ABD31001713201540.AL35	0.8	Good	19/10/2021	2021
7998	ABD31001713201540.AL35	1.1	Good	20/09/2022	2022
23	ABD31001715400.AL35	1.4	Good	06/11/2018	2018
2101	ABD31001715400.AL35	1.9	Good	23/06/2020	2020
3080	ABD31001715400.AL35	1.9	Good	10/11/2020	2020
6133	ABD31001715400.AL35	2.1	Good	20/07/2021	2021
7125	ABD31001715400.AL35	1.7	Good	19/10/2021	2021
7999	ABD31001715400.AL35	1.1	Good	20/09/2022	2022
17	ABD310017220440.AL35	1.2	Good	19/02/2019	2019
2095	ABD310017220440.AL35	0.7	Good	23/06/2020	2020
3074	ABD310017220440.AL35	0.7	Good	10/11/2020	2020
6127	ABD310017220440.AL35	0.7	Good	20/07/2021	2021
7119	ABD310017220440.AL35	1.3	Good	19/10/2021	2021
7993	ABD310017220440.AL35	null	No Data	20/09/2022	2022

MSargeant · ‎01-19-2023

Hi,

Thanks for your response. However, I don't have duplicate rows in my data - those have already been removed. I need the most recent recording date per year (2018, 2019, 2020, 2021 2022) for each unique ID. Each unique ID should be left with the most recent recording date for every year which would result in 4-5 rows per unique ID.

View solution in original post

chris88 · ‎01-19-2023

Hi,

I think your best approach would be using the Table.Buffer function.

First sort the column newest to oldest, then create a custom step using Table.Buffer, then remove duplicates while selecting the year and ID columns together.

Similar example below

let   
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Employee Number", Int64.Type}, {"Employee Name", type text}, {"Date Of Joining"", type datetime}, {"Annual Salary", Int64.Type}, {"Additional Comments", type text}}),
 #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Date Of Joining", Order.Descending}}),
  #"Buffered" = Table.Buffer(#"Sorted Rows"),
 #"Removed Duplicates" = Table.Distinct(#"Buffered", {"Employee Number"})
in
 #"Removed Duplicates"

MSargeant · ‎01-19-2023

Hi,

Thanks for your response. However, I don't have duplicate rows in my data - those have already been removed. I need the most recent recording date per year (2018, 2019, 2020, 2021 2022) for each unique ID. Each unique ID should be left with the most recent recording date for every year which would result in 4-5 rows per unique ID.

chris88 · ‎01-19-2023

Yes, by highlighting ID and Year and removing duplicates, it will only remove rows where the Year and ID are the same. By using sort, then table buffer before doing this then it will keep the latest date for each ID in each year.

removing duplicates does not necessarily remove duplicate rows. It removes duplicate instances of the column(s) selected. While your dataset doesn't have duplicate rows, it does have rows where both the year and the ID are the same.

MSargeant · ‎01-20-2023

Many thanks for the additional explanation. The solution worked well.

Thank you!

Find the Max Date by Year for Multiple IDs Using Two Columns with Multiple Years in Power Query

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly