cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
Frequent Visitor

Remove duplicates while retaining null/blank rows in a column

Need to remove duplicate values from a column that had a lot of null values too. Need to retain the null rows.

Tried using query editor's remove duplicate rows function, but it removes all null values too.

 

Initial ListInitial List

 

Result after using remove duplicate rows function. Where as all 7 null rows from initial list has  to be retainedResult after using remove duplicate rows function. Where as all 7 null rows from initial list has to be retained

 

 

Result after using remove duplicate rows function. Whereas all 7 null rows from initial list has to be retained.

I'm attaching the sample file for further persual. Sample file 

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted
Resolver II
Resolver II

Re: Remove duplicates while retaining null/blank rows in a column

Hi, @powerman101 

 

let
    Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("i65WSlSyUko2NDJV0lFKAjIrKlKUanVgwiZQ0eLEdLhooiFMbTFcMK80JwcmVoxNJValSIZWGRlClaan58BF0+AOSM/G1J+ajc2q3JxiTKW5SEqT4Y4qT1WqjQUA",BinaryEncoding.Base64),Compression.Deflate))),
    result = Table.FromRows(List.Distinct(Table.ToRows(Source),each if _{0}=null then Text.NewGuid() else _{0}))
in
    result

If my code can solve your problem, Mark it as a solution

View solution in original post

4 REPLIES 4
Highlighted
Super User II
Super User II

Re: Remove duplicates while retaining null/blank rows in a column

You need to supply some data. You cannot use the Table.Distinct function here because 2 nulls are duplicates, and it will remove them. 

 

This can be done, but not with the menu options. See links below to share data.

How to get good help fast. Help us help you.
How to Get Your Question Answered Quickly
How to provide sample data in the Power BI Forum



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
Highlighted
Super User I
Super User I

Re: Remove duplicates while retaining null/blank rows in a column

Hi @powerman101 ,

 

1. Filter out all nulls from your source table and then run Remove duplicates

2. Filter out  all non-nulls from your original table

3. Combine 1 + 2.

 

Kind regards,

JB

 

Highlighted
Community Support
Community Support

Re: Remove duplicates while retaining null/blank rows in a column

Hi @powerman101 ,

 

If you want to remove duplicate rows from null-blank rows and keep all null rows, we can add a step to meet your requirement:

 

SelectDesireTable = Table.Combine({Table.SelectRows(NameOfYourLastStep, each [Column1] = null),Table.Distinct(Table.SelectRows(NameOfYourLastStep, each [Column1] <> null), {"Column1"})})

 

If the distinct logic depens on multi columns

 

SelectDesireTable = Table.Combine({
        Table.SelectRows(NameOfYourLastStep, each [Column1] = null and [Column2] = null),
        Table.Distinct(Table.SelectRows(NameOfYourLastStep, each [Column1] <> null or [Column2] <> null), {"Column1","Column2"})})

 

All queries are here:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dcc3DsAwDATBv7BW41w75+xO0P+/4TULdgZugFvvJRInJSoJzkvMq9FoJbwWnVbK6zFYjZi0Mt6MxWrFppXzdhxWJy4tzrf/W3BuPBLCCw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t]),
    NameOfYourLastStep = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}, {"Column2", type text}, {"Column3", type text}}),
    SelectDesireTable = Table.Combine({Table.SelectRows(NameOfYourLastStep, each [Column1] = null),Table.Distinct(Table.SelectRows(NameOfYourLastStep, each [Column1] <> null), {"Column1"})})
in
    SelectDesireTable

 

 

 6.jpg7.jpg

 

If you want to keep the order, we can create a index column first, then create a custom column using following formula:

 

let c = _, 
temp = Table.SelectRows(#"Added Index",each [Column1] = c[Column1]),
rowcount =  Table.RowCount(temp)
in if rowcount=1 or [Column1]=null then [Index] else Table.Max(temp,"Index")[Index]

 

If you want to remove depends on multi columns:

 

let c = _, 
temp = Table.SelectRows(#"Added Index",each [Column1] = c[Column1] and [Column2] = c[Column2]),
rowcount =  Table.RowCount(temp)
in if rowcount=1 or ([Column1]=null and [Column2]=null) then [Index] else Table.Max(temp,"Index")[Index]

 

All the queries are here (Table (2) in sample file) :

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dcc3DsAwDATBv7BW41w75+xO0P+/4TULdgZugFvvJRInJSoJzkvMq9FoJbwWnVbK6zFYjZi0Mt6MxWrFppXzdhxWJy4tzrf/W3BuPBLCCw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t]),
    NameOfYourLastStep = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}, {"Column2", type text}, {"Column3", type text}}),
    #"Added Index" = Table.AddIndexColumn(NameOfYourLastStep, "Index", 0, 1),
    #"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each let c = _, 
temp = Table.SelectRows(#"Added Index",each [Column1] = c[Column1]),
rowcount =  Table.RowCount(temp)
in if rowcount=1 or [Column1]=null then [Index] else Table.Max(temp,"Index")[Index]),
    #"Removed Duplicates" = Table.Distinct(#"Added Custom", {"Custom"}),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates",{"Index", "Custom"})
in
    #"Removed Columns"

 

After remove the duplicate rows depends on custom column and the additional column, we can get the desire result:

 

8.jpg9.jpg

 

 

If it doesn't meet your requirement, Could you please show the exact expected result based on the tables that we have shared?


By the way, PBIX file as attached.


Best regards,

 

Community Support Team _ Dong Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Highlighted
Resolver II
Resolver II

Re: Remove duplicates while retaining null/blank rows in a column

Hi, @powerman101 

 

let
    Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("i65WSlSyUko2NDJV0lFKAjIrKlKUanVgwiZQ0eLEdLhooiFMbTFcMK80JwcmVoxNJValSIZWGRlClaan58BF0+AOSM/G1J+ajc2q3JxiTKW5SEqT4Y4qT1WqjQUA",BinaryEncoding.Base64),Compression.Deflate))),
    result = Table.FromRows(List.Distinct(Table.ToRows(Source),each if _{0}=null then Text.NewGuid() else _{0}))
in
    result

If my code can solve your problem, Mark it as a solution

View solution in original post

Helpful resources

Announcements
Community Blog

Community Blog

Visit our Community Blog for articles, guides, and information created by fellow community members.

Using the Community

Using the Community

Need help with the Power BI Community? Our 'Using the Community' support articles are a great place to start.

Community Summit North America

Community Summit North America

Innovate, Collaborate, Grow. The top training and networking event across the globe for Microsoft Business Applications

Power Platform 2020 release wave 2 plan

Power Platform 2020 release wave 2 plan

Features releasing from October 2020 through March 2021