Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hi, I want to get rid of duplicate rows from a table that meets the below condition.
Given the below table where ticker is the stock symbol, fund is the name of the pooled money funding, bought is how many shares of ticker, and person is the buyer:
Requirement: I want to get rid of duplicate rows if the Ticker and Person column is the same other rows. Of the duplicate rows, I want to keep the row that has the highest qty/number in the Bought column. Example of duplicate rows are highlighted, where yellow row is to be deleted and green is to be kept as that row has the highest qty/number in the Bought column.
Expected outcome is the below after removing the highlighted duplicates:
Solved! Go to Solution.
Hi @woofwoof123 ,
First of all, many thanks to @ryan_mayu for your very quick and effective replies.
Based on my testing, please try the following methods as workaround:
1.Create the simple table.
2.Create the index column in power query editor.
3.Create the calculated column to rank.
Rank =
VAR _ticket = 'Table'[Ticket]
VAR _person = 'Table'[Person]
RETURN
RANKX(FILTER(ALL('Table'),'Table'[Ticket] = _ticket && 'Table'[Person] = _person),[Index],,ASC)
4.Create the calculated column to flag the column.
Flag =
VAR _ticket = 'Table'[Ticket]
VAR _person = 'Table'[Person]
var _maxbought = MAXX(FILTER(ALL('Table'), 'Table'[Ticket] = _ticket && 'Table'[Person] = _person),[Bought])
var _count = COUNTX(FILTER(ALL('Table'),'Table'[Ticket] = _ticket && 'Table'[Person] = _person),[Ticket])
return
IF(
_count = 1 && [Rank]=1,
1,
IF(
_count > 1 && 'Table'[Bought] = _maxbought,1,0)
)
5.Create the new calculated table.
Table 2 =
var _table1 = FILTER('Table',[Flag]=1)
return
SUMMARIZE(
_table1,[Ticket],[Fund],[Bought],[Person])
6.The result is shown below.
Best Regards,
Wisdom Wu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
you can also do that in PQ
let
Source = Table,
Custom1 = Table.AddColumn(Source,"Rank",each Table.RowCount(Table.SelectRows(Source,(x)=>x[Bought]>[Bought] and x[Ticket]=[Ticket] and x[Person]=[Person]))+1),
#"Filtered Rows" = Table.SelectRows(Custom1, each ([Rank] = 1)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Rank"})
in
#"Removed Columns"
pls see the attachment below
Proud to be a Super User!
Hi,
This M code works
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("fY8/D4IwFMS/S2eW/qHOlTJoaGKEjTDU2MiA1Fii4dv7WkKpMXG5N/xy9+7aFglxqlCGpL2BUgIirvqOuiyiQk+g2KO9vaSk7o15wOUsT6GS3mXfoISCHLWz4w/i3iNcP5g5sMK6afNhyuAo/VygLJN3O+5TpX6ZwA7n2B/TtL8qGxH7c+YD5TzoMYWLjzD+B+b8a0RTV1tsGChW34rChpxF1H0A", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Ticket = _t, Fund = _t, Bought = _t, Person = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Ticket", type text}, {"Fund", type text}, {"Bought", Int64.Type}, {"Person", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Ticket", "Person"}, {{"All", each Table.Max(_,"Bought")}}),
#"Expanded All" = Table.ExpandRecordColumn(#"Grouped Rows", "All", {"Fund", "Bought"}, {"Fund", "Bought"})
in
#"Expanded All"
Hope this helps.
Hi @woofwoof123 ,
First of all, many thanks to @ryan_mayu for your very quick and effective replies.
Based on my testing, please try the following methods as workaround:
1.Create the simple table.
2.Create the index column in power query editor.
3.Create the calculated column to rank.
Rank =
VAR _ticket = 'Table'[Ticket]
VAR _person = 'Table'[Person]
RETURN
RANKX(FILTER(ALL('Table'),'Table'[Ticket] = _ticket && 'Table'[Person] = _person),[Index],,ASC)
4.Create the calculated column to flag the column.
Flag =
VAR _ticket = 'Table'[Ticket]
VAR _person = 'Table'[Person]
var _maxbought = MAXX(FILTER(ALL('Table'), 'Table'[Ticket] = _ticket && 'Table'[Person] = _person),[Bought])
var _count = COUNTX(FILTER(ALL('Table'),'Table'[Ticket] = _ticket && 'Table'[Person] = _person),[Ticket])
return
IF(
_count = 1 && [Rank]=1,
1,
IF(
_count > 1 && 'Table'[Bought] = _maxbought,1,0)
)
5.Create the new calculated table.
Table 2 =
var _table1 = FILTER('Table',[Flag]=1)
return
SUMMARIZE(
_table1,[Ticket],[Fund],[Bought],[Person])
6.The result is shown below.
Best Regards,
Wisdom Wu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Thanks Wisdom I will try this out
Hi,
This M code works
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("fY8/D4IwFMS/S2eW/qHOlTJoaGKEjTDU2MiA1Fii4dv7WkKpMXG5N/xy9+7aFglxqlCGpL2BUgIirvqOuiyiQk+g2KO9vaSk7o15wOUsT6GS3mXfoISCHLWz4w/i3iNcP5g5sMK6afNhyuAo/VygLJN3O+5TpX6ZwA7n2B/TtL8qGxH7c+YD5TzoMYWLjzD+B+b8a0RTV1tsGChW34rChpxF1H0A", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Ticket = _t, Fund = _t, Bought = _t, Person = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Ticket", type text}, {"Fund", type text}, {"Bought", Int64.Type}, {"Person", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Ticket", "Person"}, {{"All", each Table.Max(_,"Bought")}}),
#"Expanded All" = Table.ExpandRecordColumn(#"Grouped Rows", "All", {"Fund", "Bought"}, {"Fund", "Bought"})
in
#"Expanded All"
Hope this helps.
why we don't delete the first row? the ticket and person are also the same as the second and third rows.
Proud to be a Super User!
ah I missed that.. thanks for catching that
you can also do that in PQ
let
Source = Table,
Custom1 = Table.AddColumn(Source,"Rank",each Table.RowCount(Table.SelectRows(Source,(x)=>x[Bought]>[Bought] and x[Ticket]=[Ticket] and x[Person]=[Person]))+1),
#"Filtered Rows" = Table.SelectRows(Custom1, each ([Rank] = 1)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Rank"})
in
#"Removed Columns"
pls see the attachment below
Proud to be a Super User!
User | Count |
---|---|
93 | |
85 | |
76 | |
66 | |
62 |
User | Count |
---|---|
112 | |
96 | |
94 | |
64 | |
58 |