Anonymous
Not applicable

Duplicates returned with Table.Distinct in M Query

Posted this by mistake in the Ideas section.  Reposting here.

 

Hi,

I am trying to create a table containing only distinct rows from the union of two existing Power Query tables in Excel.

 

My code (anonymised) is:

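let
    // Keep only the three columns to compare in each source table
    SelectColumnsT1 = Table.SelectColumns(T1Data, {"Field1", "Field2", "Field3"}),
    SelectColumnsT2 = Table.SelectColumns(T2Data, {"Field1", "Field2", "Field3"}),
    // Union of the two tables (duplicates are still present at this point)
    CombineBoth = Table.Combine({SelectColumnsT1, SelectColumnsT2}),
    // Drop rows that repeat another row on all three columns
    GetDistinct = Table.Distinct(CombineBoth, {"Field1", "Field2", "Field3"})
in
    GetDistinct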

Field1 is an integer and the other fields are strings.

This returns duplicates in the resulting table.  I have checked the individual duplicated rows: there are no leading or trailing blanks, and when I test in Excel whether the fields in the duplicated rows are equal, the result is TRUE.

 

Am I misunderstanding the use of Table.Distinct?

Have I got the syntax wrong?

Is there a bug in this function?

Any other possible things I should look into to try to get to the bottom of this?
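
One thing that may be worth checking, offered as a sketch rather than a diagnosis: Excel's = comparison ignores case, whereas Table.Distinct compares text case-sensitively by default, so rows that test as equal in Excel can still count as distinct in Power Query. The query below compares row counts before and after normalising case, surrounding spaces and control characters. The sample table and the step names are made up for illustration; only the column names Field1 to Field3 come from the post.

```powerquery
let
    // Hypothetical sample standing in for the combined table (not real data)
    Combined = #table(
        type table [Field1 = Int64.Type, Field2 = text, Field3 = text],
        {
            {2015, "A", "Red"},
            {2015, "a", "Red"},   // differs only in case
            {2015, "A", "Red "},  // differs only by a trailing space
            {2016, "A", "Red"}
        }
    ),
    // Distinct on the raw values, as in the original query
    RawDistinct = Table.Distinct(Combined, {"Field1", "Field2", "Field3"}),
    // Strip control characters, trim spaces and lower-case the text columns
    Normalised = Table.TransformColumns(
        Combined,
        {
            {"Field2", each Text.Lower(Text.Trim(Text.Clean(_))), type text},
            {"Field3", each Text.Lower(Text.Trim(Text.Clean(_))), type text}
        }
    ),
    NormalisedDistinct = Table.Distinct(Normalised, {"Field1", "Field2", "Field3"}),
    // Return both counts side by side for comparison
    Result = [
        RawCount = Table.RowCount(RawDistinct),
        NormalisedCount = Table.RowCount(NormalisedDistinct)
    ]
in
    Result
```

With this sample, RawCount is 4 and NormalisedCount is 2. If the two counts also differ on the real data, the extra rows differ only in case, padding or non-printing characters.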

 

I would be grateful if anyone can give me any help on this.

 

Regards,

Mark

Ashish_Mathur
Super User

Hi,

Share some data and show the expected result.


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/
Anonymous
Not applicable

Hi Ashish,

Thanks very much for replying.

The actual data is proprietary so I can't share it; however, here is some made-up data to illustrate the point:

 

Table 1

Year  Attr1  Attr2
2015  A      Red
2016  A      Red
2015  B      Red
2015  B      Blue

 

Table 2

Year  Attr1  Attr2
2015  A      Red
2016  A      Red
2015  B      Green
2015  B      Blue

 

First step is a union.

Combined table

Year  Attr1  Attr2
2015  A      Red    <- duplicate 1
2016  A      Red    <- duplicate 2
2015  B      Red
2015  B      Blue   <- duplicate 3
2015  A      Red    <- duplicate 1
2016  A      Red    <- duplicate 2
2015  B      Green
2015  B      Blue   <- duplicate 3

 

Next step is to get distinct rows.  I've indicated above the duplicates introduced by the union query.  The distinct step should remove all but one instance of each duplicate.

Year  Attr1  Attr2
2015  A      Red
2016  A      Red
2015  B      Red
2015  B      Blue
2015  B      Green
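
For anyone who wants to reproduce the walk-through above, here is a minimal self-contained sketch of the same union and distinct steps on the made-up data. The step names (Table1, Table2, Combined, DistinctRows) are illustrative and do not come from the real workbook.

```powerquery
let
    // Made-up Table 1 from the post
    Table1 = #table(
        type table [Year = Int64.Type, Attr1 = text, Attr2 = text],
        {{2015, "A", "Red"}, {2016, "A", "Red"}, {2015, "B", "Red"}, {2015, "B", "Blue"}}
    ),
    // Made-up Table 2 from the post
    Table2 = #table(
        type table [Year = Int64.Type, Attr1 = text, Attr2 = text],
        {{2015, "A", "Red"}, {2016, "A", "Red"}, {2015, "B", "Green"}, {2015, "B", "Blue"}}
    ),
    // Union: 8 rows, including the three duplicated pairs
    Combined = Table.Combine({Table1, Table2}),
    // Distinct over all three columns
    DistinctRows = Table.Distinct(Combined, {"Year", "Attr1", "Attr2"})
in
    DistinctRows
```

On this sample the result has five rows, matching the table above.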

 

The two data tables I'm working with are 291k and 69k records long.  The Table.Distinct query returns 5,284 rows; however, de-duping these 5,284 rows reduces the row count to 5,250, so there are 34 duplicated rows (pairs only, no triplicates etc.).  Hence it is *almost* successful in producing distinct rows, just not quite there.
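
To see which rows make up the 34 pairs, one option is to group on a normalised key and keep only the groups with more than one row. This is a sketch that assumes the rows differ in case or surrounding whitespace, which is an assumption and not something confirmed in this thread. DistinctOutput is a hypothetical stand-in for the output of the Table.Distinct step, and the Key, Count and Rows column names are made up.

```powerquery
let
    // Hypothetical stand-in for the output of the Table.Distinct step
    DistinctOutput = #table(
        type table [Year = Int64.Type, Attr1 = text, Attr2 = text],
        {{2015, "A", "Red"}, {2015, "A", "red"}, {2016, "A", "Red"}}
    ),
    // Build a comparison key that ignores case and surrounding spaces
    WithKey = Table.AddColumn(
        DistinctOutput,
        "Key",
        each Text.Combine(
            {Text.From([Year]), Text.Lower(Text.Trim([Attr1])), Text.Lower(Text.Trim([Attr2]))},
            "|"
        ),
        type text
    ),
    // Count the rows per key and keep the original rows for inspection
    Grouped = Table.Group(
        WithKey,
        {"Key"},
        {{"Count", each Table.RowCount(_), Int64.Type}, {"Rows", each _, type table}}
    ),
    // Keys that occur more than once are the suspect duplicates
    Suspects = Table.SelectRows(Grouped, each [Count] > 1)
in
    Suspects
```

Expanding the Rows column for each suspect key shows the original values side by side, which makes any difference in case or padding visible.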

Regards,

Mark

Hi,

This works

[Screenshot: Untitled.png]


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/
Anonymous
Not applicable

Hi Ashish.

Your code is more or less the same as mine.

I have experimented with an up-to-date version of Excel (my office uses Excel 2013; my personal laptop has the latest Excel 365).

The problem disappears on my version of Excel, so I think I may have uncovered a bug in the old version, which I guess I can't get around.

Thanks for spending the time to help me out.

Regards,

Mark

You are welcome.


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/
