Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hello I am finding some similar questions but nothing for my exact situation. Relatively new to Power Query and Power BI. I am trying to find a way to remove a row that has duplicate data in one column but different data in the other. See my below picture:
Essentially I have some duplicate values in the SKU column and different values in the Forecast Month column. For every situation where there is a duplicate SKU value, I am trying to keep the Month Value row rather than the Billed Value Row. In the above example, I am trying to keep the green highlighted row, while removing the orange.
Ordinarily in Excel, I would just highlight duplicate values in SKUs, filter for the color of the highlight, filter Forecast Month to Billed, and then delete those rows. This would leave me with no duplicates in the SKU column then. Please respond if you can help me!
Hi @Kjoe2495, I'm refering to @spinfuzer sample data (same result with different approach):
Change 2nd step YourSource = Source (refer instead of Source to your data):
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlTSUXLKzMlJTVGK1YlWSkLlJqNyU1C5qajcNEyuY0FRZg6Ylw7kuaUmFZUmFlXCBZBUZyBxYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [SKU = _t, #"Forecast Month" = _t]),
YourSource = Source,
GroupedRows = Table.Group(YourSource, {"SKU"}, {{"Forecast Month", each (if Table.RowCount(_) = 1 then _ else Table.SelectRows(_, (r)=> r[Forecast Month] <> "Billed")){0}[Forecast Month], type text}})
in
GroupedRows
Using custom sorting to sort by SKU and then by month not equal to Billed.
Table.Buffer this step.
Remove duplicates.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlTSUXLKzMlJTVGK1YlWSkLlJqNyU1C5qajcNEyuY0FRZg6Ylw7kuaUmFZUmFlXCBZBUZyBxYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [SKU = _t, #"Forecast Month" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"SKU", type text}, {"Forecast Month", type text}}),
#"Sorted Rows" =
Table.Buffer(
Table.Sort(
#"Changed Type",
(x,y) =>
[
a = Value.Compare(x[SKU],y[SKU]),
b = if a <> 0 then a
else Value.Compare(x[Forecast Month]="Billed",y[Forecast Month]="Billed")
][b]
)
),
#"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"SKU"})
in
#"Removed Duplicates"