cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
DataGuy2016
Advocate II
Advocate II

Table Merge creates duplicate records

Hi Folks,

 

I am stumped.

 

I am trying to merge two tables into a new table using a LEFT JOIN.  Both tables have unique records on each row.  When I join the tables, BI creates duplicate rows on some records for no apparent reason.  The duplicates are identical in every way.

 

Can anybody explain this?  Is this a bug?

1 ACCEPTED SOLUTION
DataGuy2016
Advocate II
Advocate II

The problem lay with the Data.

 

The LEFT JOIN functionality of PowerBI works perfectly.

View solution in original post

32 REPLIES 32
gpfox
New Member

I had same problem. I was joining Table A with Table B on 2 columns, that were numbers but marked as Text type.  I changed the type of these columns I was joining on in both tables to Whole number type. Then the Left Join worked and the duplicates went away!

PaulSeeq
Frequent Visitor

I had the same problem and found my solution so thought I'd share it, for what its worth.

 

I had a very large table of data and another table of account details that I was merging to my first data set, but was getting duplicate rows.

 

There was a unique key to link between the 2 tables but hadn't created a relationship on the model window. As it turns out it was a many to many relationship and this is what caused the dulpicates. Since the account table was only there to extract additional info to my first data set, in the account table in the Power Query editor window I simply clicked on my key column and selected remove duplicates to force a many to one relationship and this removed the duplication on merge.

I don't really get why the people commenting after my comment are still looking for an answer!!!!

 

my comment was litterally a valid solution, will I actually didn't really understand how merge works back then when I looked for this post, people are still missing merge with the regular vlookup function.

 

vlookup looks only for the first value in a column.

merge brings all the instances of this value no matter how many there are, and it duplicates the value in the first table to match that count, it works this way and there is no problem in this, the logic is different (it simply doesn't know which instance you wish to have so it brings you all of them).

 

all you have to do is to remove duplicates, you have to do it in the second table you're merging with (before you do the merge), or you have to do it the way I described in my last "solution" comment (you may look here in the previous comments).

 

please mark my comment as a solution so people don't keep getting lost everytime they enter this thread.

 

Regards for all of you.

baronraghu
Helper III
Helper III

Hi All,

 

I having similar issue, Below are the tables and the output after using full outer join

 

Table1 

image.png

 

Table 2

image.png

 

Merging query screenshot

image.png

 

Merged table with duplicate entries- actually it copies table 2 for each entry of table 1

image.png

 

Any solution

ksatpati
New Member

Had the same issue, where I was merging queries.

The issue in my instance was that the base query was expanded which resulted in duplicates in the base query.

Once the expansion was removed in the base query the merged query was fine. 

So if data is okay, the issue may be caused by duplicates in the base query.

TKurt
Frequent Visitor

Can you explain "base query was expanded"?

othmanalazzam
Frequent Visitor

Hi people.

 

I've been looking for this issue with no luck as well until I founded a valid work around.

 

the work around goes like this ::

 

-Merge your sheets as you like (you don't have to merge on more than one column, one is fine).

 

-Add index column

 

-Expand your merged column (this will create duplicate values, the main issue we're looking into).

 

-since the values in the "index column" were purely unique, now Remove duplicates using index column.

 

-and now your data is back to original as it was inteded to be.


@othmanalazzam wrote:

Hi people.

 

I've been looking for this issue with no luck as well until I founded a valid work around.

 

the work around goes like this ::

 

-Merge your sheets as you like (you don't have to merge on more than one column, one is fine).

 

-Add index column

 

-Expand your merged column (this will create duplicate values, the main issue we're looking into).

 

-since the values in the "index column" were purely unique, now Remove duplicates using index column.

 

-and now your data is back to original as it was inteded to be.


My duplicate issue is solved by applying this method. Thank you very much.

RodgerB
Frequent Visitor

I've been working with a sales table and trying to create dimension tables from the table with all of the fields.  I've been frustrated, because I could not tell where the duplicate rows of data were coming from; however, could tell they happened after one of the two merge query steps.

 

I can affirm that having a unique index prior to the merge, and then removing duplicates from this field works.

 

Does anyone know why this occurs?

Anonymous
Not applicable

I'm facing the same trouble here. Not sure why merging 2 tables create duplicates. Have anyone got a work around this? 

 

Update 9/10 : Problem Solved. My original first table already have got unique rows hence I should have merged it as aggregated values otherwise it will further expand the table, causing duplicates. Also, it helps to use and select more than one field as the key to look up if you need that level of granularity. 

GenericUsername
Frequent Visitor

I just ran into the same problem and fixed it by joining the two tables on multiple fields.  For example, let's say a transaction number is the unique field you are joining on, expand the join to include other fields if possible.  So your joins would be on transaction number, time, and amount.  

Yes, that's what I've done too. However, even that solution created some duplicates as we don't have a one to one mappings in some cases. The only solution I could come up with (and one that would've slow down the query significantly) was to remove duplicates against the columns which then were used in merge. To make sure there's only one line per particular combination of columns. It's similar to how vlookup works. It brings up only the first occurrence for duplicates. However, since I'm new to PQ there might be things which I am not aware of. Thus waiting for some more experienced users to chip in.
TS1989
Regular Visitor

I've had the same problem for some time now. It looks like it's 'fixed' after changing the privacy settings to 'Always ignore Privacy Level settings'.

SGLake
Frequent Visitor

I changed the privacy settings and still have the issue too. To work around I just used the remove duplicates feature, I know this is not ideal but I know my table only has one line for each record.

DataGuy2016
Advocate II
Advocate II

The problem lay with the Data.

 

The LEFT JOIN functionality of PowerBI works perfectly.

View solution in original post

Hi I’m having the exact same issue when merging two tables, I see you posted that the issue lays with the data and not the merge, can I ask how you solved it? I’m pretty sure I have unique values so I believe that is not the issue, actually the merge is working on someone else’s computer but not on mine. Appreciate any hint you may provide.

Sadly, I still have not been able to fix this problem with the "Merge" feature.  I have resorted to using "Append" instead - and in some cases it works and in others it doesn't.  It is a very frustrating situation.

I got the same problem.

Source table is an daily order table with duplicated rows, i.e. SKU ID and merged with another matching table with unique rows of SKU ID, and created randomly duplicated rows.

 

Whatever I tried to remove duplicates or aggregated by median still not get the correct final data.

 

My order data could get up to 20K rows and it's easily get wrong while I merged the tables.

 

That is so annoying.

 

Hope POWER BI experts can really solve this problem.

 

Thanks.

Hi, is there anyone who can help with this? I have the same issue and there seems to be no solution for it apart from Table.First, but this function slows down the query...

I'm having the same issue thanks. What did you do to fix your data set? 

Helpful resources

Announcements
MBAS on Demand

2021 Release Wave 2 Plan

Power Platform release plan for the 2021 release wave 2 describes all new features releasing from October 2021 through March 2022.

July 2021 Update 768x460.png

Check it out!

Click here to read more about the July 2021 Updates

Power Query PA Forum 768x460.png

Check it out!

Did you know that you can visit the Power Query Forum in Power BI and now Power Apps

Top Solution Authors
Top Kudoed Authors