Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi all
Looking to amend rows based on duplicate count of such rows
For each duplicate count of a row, any instance requires to be the "main" row and then the rest are "duplicate rows" It needs to be completed within Power Query and not as a measure
Before:
Customer |
Adam |
Adam |
Adam |
Chris |
Chris |
Anthony |
After:
Customer | Edit |
Adam | Main |
Adam | Duplicate |
Adam | Duplicate |
Chris | Main |
Chris | Duplicate |
Anthony | Main |
Does anyone know how to achieve this?
Kind regards,
Solved! Go to Solution.
Hi @muggydaniel ,
In PQ you can use Table. Group and then from the resulting group, add an index column. Those with 0 index is the main row else the duplicate. Here's a sample code:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WckxJzFWK1cHGcM4oyixGYznmlWTk51UqxcYCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Customer = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Customer"}, {{"Group", each _, type table [Customer=nullable text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([Group], "Index" ), type table),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Group"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns", "Custom", {"Index"}, {"Index"}),
#"Added Custom1" = Table.AddColumn(#"Expanded Custom", "Main/Duplicate", each if [Index] = 0 then "Main" else "Duplicate", type text)
in
#"Added Custom1"
For small tables, the approach above is fine but on a very large table I would prefer DAX as it is more optimized at scanning a very large table than PQ. After adding an index column and loading the table, I would create this calculated column (not a measure)
Main/Duplicate =
IF (
CALCULATE ( MIN ( 'DAX'[Index] ), ALLEXCEPT ( 'DAX', 'DAX'[Customer] ) ) = 'DAX'[Index],
"Main",
"Duplicate"
)
Please see attached pbix for your reference.
Proud to be a Super User!
Hi @muggydaniel ,
In PQ you can use Table. Group and then from the resulting group, add an index column. Those with 0 index is the main row else the duplicate. Here's a sample code:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WckxJzFWK1cHGcM4oyixGYznmlWTk51UqxcYCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Customer = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Customer"}, {{"Group", each _, type table [Customer=nullable text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([Group], "Index" ), type table),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Group"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns", "Custom", {"Index"}, {"Index"}),
#"Added Custom1" = Table.AddColumn(#"Expanded Custom", "Main/Duplicate", each if [Index] = 0 then "Main" else "Duplicate", type text)
in
#"Added Custom1"
For small tables, the approach above is fine but on a very large table I would prefer DAX as it is more optimized at scanning a very large table than PQ. After adding an index column and loading the table, I would create this calculated column (not a measure)
Main/Duplicate =
IF (
CALCULATE ( MIN ( 'DAX'[Index] ), ALLEXCEPT ( 'DAX', 'DAX'[Customer] ) ) = 'DAX'[Index],
"Main",
"Duplicate"
)
Please see attached pbix for your reference.
Proud to be a Super User!
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
96 | |
94 | |
82 | |
70 | |
64 |
User | Count |
---|---|
116 | |
106 | |
94 | |
79 | |
72 |