Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
HI,
I hope someone can help with this.
After I merge 2 queries, some of my data duplicates into new rows in my data table. I am possibly using the wrong type of merge (left outer) but I was hoping someone can help stop this from happeneing. If I use the remove duplicates feature then it will remove some good data as well.
Solved! Go to Solution.
Hello @StephenK2022 ,
Thanks for the explanation, I have another question right now, what is the key of the new table because for example let we have the following table
if the key of my tables are columns: colA and colB, when I use the following selection and right click
I get the following result,
meaning that the key columns of my new tables are colA and colB so as long this combination of values is not reappearing in the dataset, the others will be removed. As you can see the second row with the duplication of 123, x was removed in the second appearing.
I hope this helps a little bit.
Kind regards,
Stefani Vileva
Hello,
Can you please explain what is the goal of this merge?
When you are using left join, then all the rows of the left table are trying to find their match in the second table, so for example if we have the following tables:
Table A
1, xxx, 123
2, yyy, 134
3, zzz, 145
Table B
123, hello
123, bye
145, goodbye
Then, when you try to make a left join of A with B, using the last col of A and the first of B as a matching key, the result is the following
1, xxx, 123, hello
1, xxx, 123, bye
2, yyy, 134, null
3, zzz, 145, goodbye
In order to remove the duplicates, I think it is best to select the key columns of your result table and then to remove douplicates as a group directly through the power query.
If you have any questions, please let me know.
Kind regards,
Stefani
Hi Stefani,
Thanks for your reply.
The goal of the merge is to combine two data sets so that the "Number up.No up" column matches the "Job code" column.
The number up column is held in a different file and I want to match it to job code so that I can use it for a calculation.
The problem with removing duplicates through power query is that some rows have similar data.
There might be 2 separate entries that have the same values but at different times.
Also, some timestamps are the same as well but have different data:
Hello @StephenK2022 ,
Thanks for the explanation, I have another question right now, what is the key of the new table because for example let we have the following table
if the key of my tables are columns: colA and colB, when I use the following selection and right click
I get the following result,
meaning that the key columns of my new tables are colA and colB so as long this combination of values is not reappearing in the dataset, the others will be removed. As you can see the second row with the duplication of 123, x was removed in the second appearing.
I hope this helps a little bit.
Kind regards,
Stefani Vileva
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
109 | |
95 | |
77 | |
65 | |
53 |
User | Count |
---|---|
144 | |
105 | |
102 | |
89 | |
63 |