Solved: PowerQuery - Remove Duplicated lines based on crit...

ngct1112 · ‎01-13-2022

Hi all, I would like to remove lines with duplicated "ID" and keep the lowest "price" line. May I know if there any quick way(not using groupby) I could do for this case?

Original Table

ID	Price
1	4
1	346
3	27
4	458
5	734
5	6
7	45
7	85
7	7

Desired outcome:

ID	Price
1	4
3	27
4	458
5	6
7	7

Appreciated if any help

smpa01 · ‎01-13-2022

@ngct1112 try this

CT = Table.TransformColumnTypes(#"Promoted Headers", {{"ID", Int64.Type}, {"Price", Int64.Type}}), 
  #"Filtered Rows" = Table.SelectRows(
    CT, 
    each ([Price] = List.Min(Table.SelectRows(CT, (q) => q[ID] = [ID])[Price]))
  )

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

My custom visualization projects

Plotting Live Sound: Viz1

Beautiful News:Viz1, Viz2, Viz3

Visual Capitalist: Working Hrs

Others:Easing Graph, Animated Calendar

View solution in original post

freginier · ‎01-13-2022

Easier way, add sorted rows, buffer and remove duplicate step

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTI0UIrVgTDNwSwnIMsSzjI0UoqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", Int64.Type}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Column2", Order.Ascending}}),
    #"Buffered" = Table.Buffer(#"Sorted Rows"),
    #"Removed Duplicates" = Table.Distinct(Buffered, {"Column1"})
in
    #"Removed Duplicates"

View solution in original post

freginier · ‎01-13-2022

Easier way, add sorted rows, buffer and remove duplicate step

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTI0UIrVgTDNwSwnIMsSzjI0UoqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", Int64.Type}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Column2", Order.Ascending}}),
    #"Buffered" = Table.Buffer(#"Sorted Rows"),
    #"Removed Duplicates" = Table.Distinct(Buffered, {"Column1"})
in
    #"Removed Duplicates"

ngct1112 · ‎01-13-2022

@smpa01 it works perfectly. Thanks for your solution!

smpa01 · ‎01-13-2022

@ngct1112 try this

CT = Table.TransformColumnTypes(#"Promoted Headers", {{"ID", Int64.Type}, {"Price", Int64.Type}}), 
  #"Filtered Rows" = Table.SelectRows(
    CT, 
    each ([Price] = List.Min(Table.SelectRows(CT, (q) => q[ID] = [ID])[Price]))
  )

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

My custom visualization projects

Plotting Live Sound: Viz1

Beautiful News:Viz1, Viz2, Viz3

Visual Capitalist: Working Hrs

Others:Easing Graph, Animated Calendar

AlexisOlson · ‎01-13-2022

This feels like a DAX sort of approach. In M, you want to be very careful about doing table filtering like this since the O(n^2) complexity will make it impractical for larger datasets.

In this particular case, we can do this with a simple Group By:

= Table.Group(#"Changed Type", {"ID"}, {{"Price", each List.Min([Price]), type nullable number}})

smpa01 · ‎01-13-2022

@AlexisOlson This feels like a DAX sort of approach - yes it is 😀 I was simply testing out how can I avoid groupby (OP wants to avoid groupby)and still get there and translate DAX to M

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

My custom visualization projects

Plotting Live Sound: Viz1

Beautiful News:Viz1, Viz2, Viz3

Visual Capitalist: Working Hrs

Others:Easing Graph, Animated Calendar

AlexisOlson · ‎01-13-2022

Ah, this makes much more sense now. I must have skimmed over that particular requirement.

Group by is the perfect tool for this, so I was rather confused about why no one suggested it. Turns out I need to learn how to read.

ngct1112 · ‎01-13-2022

@smpa01 it works perfectly. Thanks for your solution!

PowerQuery - Remove Duplicated lines based on criteria

Helpful resources

Europe’s largest Microsoft Fabric Community Conference

New forum boards available in Real-Time Intelligence.

Power BI Monthly Update - May 2024

Jumpstart your career with the Fabric Career Hub

PowerQuery - Remove Duplicated lines based on criteria

Helpful resources

Europe’s largest Microsoft Fabric Community Conference

New forum boards available in Real-Time Intelligence.

Power BI Monthly Update - May 2024