Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Anonymous
Not applicable

How to remove duplicate values from a column in a different table

Hello there! 


I have three columns that contain PO values (Purchase Orders), however, there are duplicates in one of them. This is because one PO column which is in a different table to the other two contains ALL the PO values. This PO columnis called "Purchase Order Number" and it's in a table called Purchase Orders, as shown below: 

KyleAdam_0-1629800278358.png

 

Now, in another table, there are two columns that contain several values from the column above, called Vendor 1 internal PO No. & Vendor 2 internal PO No. These two column are in a table called "Deals" as shown below:

KyleAdam_1-1629800677904.png

 

The issue is that the two columns above are more important as they contain several other important rows in their table. So what I want is the PO values in the two columns above to supersede the PO values in Purchase Order Number column I showed before. This is awkward and complicated because the original Purchase Order column has a total cost, so if I add the total costs of the other PO values I will get duplicated costs. This is primarily what I'm trying to avoid on my dashboard. I have tried appending the values to a new table and removing duplicates but that didn't seem to work, they weren't on matching rows for some reason. 

Any help would be appreciated.  

 

Side note: There is a relationship between the two tables, but it doesn't recognise that the PO values are unique. 

1 ACCEPTED SOLUTION
v-jingzhang
Community Support
Community Support

Hi @Anonymous 

 

How about using Power Query to create a new table to replace the original 'Purchase Orders' table?

 

First duplicate 'Deals' query in Power Query Editor, combine Vendor 1 internal PO No. & Vendor 2 internal PO No. columns into a single column and remove duplicated & null values. 

    #"Vendor 1 PO No " = #"Changed Type"[#"Vendor 1 PO No."],
    #"Vendor 2 PO No " = #"Changed Type"[#"Vendor 2 PO No."],
    #"All PO No" = List.Distinct(List.Combine({#"Vendor 1 PO No ",#"Vendor 2 PO No "})),
    #"Converted to Table" = Table.FromList(#"All PO No", Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Filtered Rows" = Table.SelectRows(#"Converted to Table", each [Column1] <> null and [Column1] <> ""),

082601.jpg

 

Then use Merge Queries feature to combine data from Purchase Orders table into this new table. Select Purchase Order Number column as the matching column. For Join Kind, if you want to keep all PO Numbers from both tables, select Full Outer. If you only want to keep PO Numbers from the new table, select Left Outer. Then you will get a new column which contains data from Purchase Orders table.

082602.jpg

 

At last, expand the Purchase Orders column and select columns you want to keep in the new table.

082603.jpg

 

You will get a new table in which the PO Number column has all distinct values from two columns in Deals table and other columns from 'Purchase Orders' table. They are matched on the PO Number values. You can use this new table to replace the original 'Purchase Orders' table in your model. You can remove old relationships between tables and create new relationships with this new table.

 

Attached a pbix for your reference.

 

Regards,
Community Support Team _ Jing
If this post helps, please Accept it as the solution to help other members find it.

View solution in original post

2 REPLIES 2
v-jingzhang
Community Support
Community Support

Hi @Anonymous 

 

How about using Power Query to create a new table to replace the original 'Purchase Orders' table?

 

First duplicate 'Deals' query in Power Query Editor, combine Vendor 1 internal PO No. & Vendor 2 internal PO No. columns into a single column and remove duplicated & null values. 

    #"Vendor 1 PO No " = #"Changed Type"[#"Vendor 1 PO No."],
    #"Vendor 2 PO No " = #"Changed Type"[#"Vendor 2 PO No."],
    #"All PO No" = List.Distinct(List.Combine({#"Vendor 1 PO No ",#"Vendor 2 PO No "})),
    #"Converted to Table" = Table.FromList(#"All PO No", Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Filtered Rows" = Table.SelectRows(#"Converted to Table", each [Column1] <> null and [Column1] <> ""),

082601.jpg

 

Then use Merge Queries feature to combine data from Purchase Orders table into this new table. Select Purchase Order Number column as the matching column. For Join Kind, if you want to keep all PO Numbers from both tables, select Full Outer. If you only want to keep PO Numbers from the new table, select Left Outer. Then you will get a new column which contains data from Purchase Orders table.

082602.jpg

 

At last, expand the Purchase Orders column and select columns you want to keep in the new table.

082603.jpg

 

You will get a new table in which the PO Number column has all distinct values from two columns in Deals table and other columns from 'Purchase Orders' table. They are matched on the PO Number values. You can use this new table to replace the original 'Purchase Orders' table in your model. You can remove old relationships between tables and create new relationships with this new table.

 

Attached a pbix for your reference.

 

Regards,
Community Support Team _ Jing
If this post helps, please Accept it as the solution to help other members find it.

Anonymous
Not applicable

Thank you @v-jingzhang that was a very good solution! 

Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.