Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Need to remove duplicate values from a column that had a lot of null values too. Need to retain the null rows.
Tried using query editor's remove duplicate rows function, but it removes all null values too.
Result after using remove duplicate rows function. Whereas all 7 null rows from initial list has to be retained.
I'm attaching the sample file for further persual. Sample file
Solved! Go to Solution.
Hi, @Anonymous
let
Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("i65WSlSyUko2NDJV0lFKAjIrKlKUanVgwiZQ0eLEdLhooiFMbTFcMK80JwcmVoxNJValSIZWGRlClaan58BF0+AOSM/G1J+ajc2q3JxiTKW5SEqT4Y4qT1WqjQUA",BinaryEncoding.Base64),Compression.Deflate))),
result = Table.FromRows(List.Distinct(Table.ToRows(Source),each if _{0}=null then Text.NewGuid() else _{0}))
in
result
If my code can solve your problem, Mark it as a solution
Hi, @Anonymous
let
Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("i65WSlSyUko2NDJV0lFKAjIrKlKUanVgwiZQ0eLEdLhooiFMbTFcMK80JwcmVoxNJValSIZWGRlClaan58BF0+AOSM/G1J+ajc2q3JxiTKW5SEqT4Y4qT1WqjQUA",BinaryEncoding.Base64),Compression.Deflate))),
result = Table.FromRows(List.Distinct(Table.ToRows(Source),each if _{0}=null then Text.NewGuid() else _{0}))
in
result
If my code can solve your problem, Mark it as a solution
Hi @Anonymous ,
If you want to remove duplicate rows from null-blank rows and keep all null rows, we can add a step to meet your requirement:
SelectDesireTable = Table.Combine({Table.SelectRows(NameOfYourLastStep, each [Column1] = null),Table.Distinct(Table.SelectRows(NameOfYourLastStep, each [Column1] <> null), {"Column1"})})
If the distinct logic depens on multi columns
SelectDesireTable = Table.Combine({
Table.SelectRows(NameOfYourLastStep, each [Column1] = null and [Column2] = null),
Table.Distinct(Table.SelectRows(NameOfYourLastStep, each [Column1] <> null or [Column2] <> null), {"Column1","Column2"})})
All queries are here:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dcc3DsAwDATBv7BW41w75+xO0P+/4TULdgZugFvvJRInJSoJzkvMq9FoJbwWnVbK6zFYjZi0Mt6MxWrFppXzdhxWJy4tzrf/W3BuPBLCCw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t]),
NameOfYourLastStep = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}, {"Column2", type text}, {"Column3", type text}}),
SelectDesireTable = Table.Combine({Table.SelectRows(NameOfYourLastStep, each [Column1] = null),Table.Distinct(Table.SelectRows(NameOfYourLastStep, each [Column1] <> null), {"Column1"})})
in
SelectDesireTable
If you want to keep the order, we can create a index column first, then create a custom column using following formula:
let c = _,
temp = Table.SelectRows(#"Added Index",each [Column1] = c[Column1]),
rowcount = Table.RowCount(temp)
in if rowcount=1 or [Column1]=null then [Index] else Table.Max(temp,"Index")[Index]
If you want to remove depends on multi columns:
let c = _,
temp = Table.SelectRows(#"Added Index",each [Column1] = c[Column1] and [Column2] = c[Column2]),
rowcount = Table.RowCount(temp)
in if rowcount=1 or ([Column1]=null and [Column2]=null) then [Index] else Table.Max(temp,"Index")[Index]
All the queries are here (Table (2) in sample file) :
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dcc3DsAwDATBv7BW41w75+xO0P+/4TULdgZugFvvJRInJSoJzkvMq9FoJbwWnVbK6zFYjZi0Mt6MxWrFppXzdhxWJy4tzrf/W3BuPBLCCw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t]),
NameOfYourLastStep = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}, {"Column2", type text}, {"Column3", type text}}),
#"Added Index" = Table.AddIndexColumn(NameOfYourLastStep, "Index", 0, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each let c = _,
temp = Table.SelectRows(#"Added Index",each [Column1] = c[Column1]),
rowcount = Table.RowCount(temp)
in if rowcount=1 or [Column1]=null then [Index] else Table.Max(temp,"Index")[Index]),
#"Removed Duplicates" = Table.Distinct(#"Added Custom", {"Custom"}),
#"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates",{"Index", "Custom"})
in
#"Removed Columns"
After remove the duplicate rows depends on custom column and the additional column, we can get the desire result:
If it doesn't meet your requirement, Could you please show the exact expected result based on the tables that we have shared?
By the way, PBIX file as attached.
Best regards,
Hi @Anonymous ,
1. Filter out all nulls from your source table and then run Remove duplicates
2. Filter out all non-nulls from your original table
3. Combine 1 + 2.
Kind regards,
JB
You need to supply some data. You cannot use the Table.Distinct function here because 2 nulls are duplicates, and it will remove them.
This can be done, but not with the menu options. See links below to share data.
How to get good help fast. Help us help you.
How to Get Your Question Answered Quickly
How to provide sample data in the Power BI Forum
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingCovering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.