Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Remove duplicates while retaining null/blank rows in a column

Need to remove duplicate values from a column that had a lot of null values too. Need to retain the null rows.

Tried using query editor's remove duplicate rows function, but it removes all null values too.

 

Initial ListInitial List

 

Result after using remove duplicate rows function. Where as all 7 null rows from initial list has  to be retainedResult after using remove duplicate rows function. Where as all 7 null rows from initial list has to be retained

 

 

Result after using remove duplicate rows function. Whereas all 7 null rows from initial list has to be retained.

I'm attaching the sample file for further persual. Sample file 

1 ACCEPTED SOLUTION
ziying35
Impactful Individual
Impactful Individual

Hi, @Anonymous 

 

let
    Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("i65WSlSyUko2NDJV0lFKAjIrKlKUanVgwiZQ0eLEdLhooiFMbTFcMK80JwcmVoxNJValSIZWGRlClaan58BF0+AOSM/G1J+ajc2q3JxiTKW5SEqT4Y4qT1WqjQUA",BinaryEncoding.Base64),Compression.Deflate))),
    result = Table.FromRows(List.Distinct(Table.ToRows(Source),each if _{0}=null then Text.NewGuid() else _{0}))
in
    result

If my code can solve your problem, Mark it as a solution

View solution in original post

4 REPLIES 4
ziying35
Impactful Individual
Impactful Individual

Hi, @Anonymous 

 

let
    Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("i65WSlSyUko2NDJV0lFKAjIrKlKUanVgwiZQ0eLEdLhooiFMbTFcMK80JwcmVoxNJValSIZWGRlClaan58BF0+AOSM/G1J+ajc2q3JxiTKW5SEqT4Y4qT1WqjQUA",BinaryEncoding.Base64),Compression.Deflate))),
    result = Table.FromRows(List.Distinct(Table.ToRows(Source),each if _{0}=null then Text.NewGuid() else _{0}))
in
    result

If my code can solve your problem, Mark it as a solution

v-lid-msft
Community Support
Community Support

Hi @Anonymous ,

 

If you want to remove duplicate rows from null-blank rows and keep all null rows, we can add a step to meet your requirement:

 

SelectDesireTable = Table.Combine({Table.SelectRows(NameOfYourLastStep, each [Column1] = null),Table.Distinct(Table.SelectRows(NameOfYourLastStep, each [Column1] <> null), {"Column1"})})

 

If the distinct logic depens on multi columns

 

SelectDesireTable = Table.Combine({
        Table.SelectRows(NameOfYourLastStep, each [Column1] = null and [Column2] = null),
        Table.Distinct(Table.SelectRows(NameOfYourLastStep, each [Column1] <> null or [Column2] <> null), {"Column1","Column2"})})

 

All queries are here:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dcc3DsAwDATBv7BW41w75+xO0P+/4TULdgZugFvvJRInJSoJzkvMq9FoJbwWnVbK6zFYjZi0Mt6MxWrFppXzdhxWJy4tzrf/W3BuPBLCCw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t]),
    NameOfYourLastStep = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}, {"Column2", type text}, {"Column3", type text}}),
    SelectDesireTable = Table.Combine({Table.SelectRows(NameOfYourLastStep, each [Column1] = null),Table.Distinct(Table.SelectRows(NameOfYourLastStep, each [Column1] <> null), {"Column1"})})
in
    SelectDesireTable

 

 

 6.jpg7.jpg

 

If you want to keep the order, we can create a index column first, then create a custom column using following formula:

 

let c = _, 
temp = Table.SelectRows(#"Added Index",each [Column1] = c[Column1]),
rowcount =  Table.RowCount(temp)
in if rowcount=1 or [Column1]=null then [Index] else Table.Max(temp,"Index")[Index]

 

If you want to remove depends on multi columns:

 

let c = _, 
temp = Table.SelectRows(#"Added Index",each [Column1] = c[Column1] and [Column2] = c[Column2]),
rowcount =  Table.RowCount(temp)
in if rowcount=1 or ([Column1]=null and [Column2]=null) then [Index] else Table.Max(temp,"Index")[Index]

 

All the queries are here (Table (2) in sample file) :

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dcc3DsAwDATBv7BW41w75+xO0P+/4TULdgZugFvvJRInJSoJzkvMq9FoJbwWnVbK6zFYjZi0Mt6MxWrFppXzdhxWJy4tzrf/W3BuPBLCCw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t]),
    NameOfYourLastStep = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}, {"Column2", type text}, {"Column3", type text}}),
    #"Added Index" = Table.AddIndexColumn(NameOfYourLastStep, "Index", 0, 1),
    #"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each let c = _, 
temp = Table.SelectRows(#"Added Index",each [Column1] = c[Column1]),
rowcount =  Table.RowCount(temp)
in if rowcount=1 or [Column1]=null then [Index] else Table.Max(temp,"Index")[Index]),
    #"Removed Duplicates" = Table.Distinct(#"Added Custom", {"Custom"}),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates",{"Index", "Custom"})
in
    #"Removed Columns"

 

After remove the duplicate rows depends on custom column and the additional column, we can get the desire result:

 

8.jpg9.jpg

 

 

If it doesn't meet your requirement, Could you please show the exact expected result based on the tables that we have shared?


By the way, PBIX file as attached.


Best regards,

 

Community Support Team _ Dong Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Hi @Anonymous ,

 

1. Filter out all nulls from your source table and then run Remove duplicates

2. Filter out  all non-nulls from your original table

3. Combine 1 + 2.

 

Kind regards,

JB

 

edhans
Super User
Super User

You need to supply some data. You cannot use the Table.Distinct function here because 2 nulls are duplicates, and it will remove them. 

 

This can be done, but not with the menu options. See links below to share data.

How to get good help fast. Help us help you.
How to Get Your Question Answered Quickly
How to provide sample data in the Power BI Forum



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors