Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I am pulling a customer table which is basically a sales table with all columns, except customer email, removed. I then 'remove duplicates' on the customer email column in Editor, essentially giving me a distinct list of all customer emails that have made a purchase with us. However, when i try to create a relationship between this column and 'customer email' in the sales table, i get an error stating that one of the 2 columns being related must be without duplicates - as i already do a distinct on the customer table, this should not happen.
Does anyone know if using a 'remove duplicates' is the reason this issue occurs and if there is a work around ?
Thanks!
Hi @S_M
Could you please explain the scenario with example :-
Like
Dataset 1
id xyz abc
1 a b
Dataset 2
id xyz abc
1 a b
Relationship between Dataset 1 and Dataset 2
and in which step you are facing challenge.
Hi @kaushikd,
To break it down : I have a sales table (Table1):
order_number customer_email
1 abc@xyz.com
2 def@xyz.com
3 ghi@xyz.com
4 def@xyz.com
I use Table1 to create a new customers table (Table2) which only has the row 'customer_email', removing duplicates, so essentially it looks like this :
Table2:
customer_email
abc@xyz.com
def@xyz.com
ghi@xyz.com
I am now trying to relate Table2 to Table1 using 'customer_email' as primary key, however it doesnt allow me to create a relationship by saying that both tables do not contain distinct values. So i am stumped at why this would happen and how to go about it
Hi I have taken the same data as you have provided:-
and tried the following followed the steps:-
1. Imported the Table1
order_number customer_email
1 abc@xyz.com
2 def@xyz.com
3 ghi@xyz.com
4 def@xyz.com
from excel file
2. Go to Edit Query--> Right Click on Table1-->Select Duplicate-->Renamed the new Table as Table2
3. For Table2-->Right Click on order_number Column and Remove.-->Right Click on customer_email Column and Remove Duplicates.
4. Close and Apply.
5.Go to Relationship
you will find power bi will automatically do a join for both the table or else please do it by your own.
If this help you out please accept this as a solution thanks
@kaushikd, unfortunately this is similar to what ive done and i face the problem above, which is that i am unable to entirely remove duplicates.
@ovetteabejuela, good idea! I actually did that and found one sneaky duplicate that is somehow not being detected by PowerQuery when removing duplicates. On excel, these duplicates get highlighted, however i ran some trim / length functions to understand why it is not being detected as a duplicate, but they are exactly alike. Any idea how i can remove these on editor ?
Hi @S_M,
Maybe you could try to remove dupplicate in Query editor by using Power Query. Here is a similar thread for you reference.
https://community.powerbi.com/t5/Desktop/Remove-or-Hide-Duplicate-Values/td-p/10409
Table.Distinct
https://msdn.microsoft.com/en-us/library/mt260775.aspx
Regards,
Charlie Liao
I would also try to do a little validation.
You can export the table to excel and do a Remove Duplicates there and see if it did find duplicates.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
107 | |
98 | |
78 | |
65 | |
60 |
User | Count |
---|---|
148 | |
113 | |
97 | |
84 | |
67 |