Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Duplicate rows show on report after doing the combination

I have three entities from the database, and I would like to make a report to combine all of these three entities' data. However, the result doesn't meet my expectations.


Entity 1: Brand_Table_ID: (customer_ID & Product_ID & Sub_type_product_ID)

Brand_

Table_ID

customer_IDProduct_ID

Sub_type_

product_ID

Brand_ValueMonth
AAA01_PPP01_01AAA01PPP0101A2021/06
AAA01_PPP01_02AAA01PPP0102A2021/06

 

 

Entity 2: Potential_Table_ID: (customer_ID & Product_ID & Sub_type_product_ID)

Potential_

Table_ID

customer_IDProduct_ID

Sub_type_

product_ID

Potential_ValueMonth
AAA01_PPP01_01AAA01PPP0101High2021/06
AAA01_PPP01_02AAA01PPP0102Low2021/06

 

Entity 3 Transaction_ID : (customer_ID & Product_ID & Sub_type_product_ID)

Transaction_IDcustomer_IDProduct_ID

Sub_type_

product_ID

Salesperson_IDMonth
AAA01_PPP01_01AAA01PPP0101SSS012021/06
AAA01_PPP01_02AAA01PPP0102SSS022021/06


After combining these three tables, I expect the report's result would be:

Unique_IDcustomer_IDProduct_ID

Sub_type_

product_ID

Salesperson_IDBrand_ValuePotential_ValueMonth
AAA01_PPP01_01AAA01PPP0101SSS01AHigh2021/06
AAA01_PPP01_02AAA01PPP0102SSS02ALow2021/06

 

However, I got this result instead

Unique_IDcustomer_IDProduct_ID

Sub_type_

product_ID

Salesperson_IDBrand_ValuePotential_ValueMonth
AAA01_PPP01_01AAA01PPP0101SSS01AHigh2021/06
AAA01_PPP01_01AAA01PPP0101SSS02AHigh2021/06
AAA01_PPP01_02AAA01PPP0102SSS01ALow2021/06
AAA01_PPP01_02AAA01PPP0102SSS02ALow2021/06

 

I would like to know what does the situation happens. Does it because the primary ID is not unique between these three tables?

If, yes, would the unique ID design as (customer_ID & Product_ID & Sub_type_product_ID & Salesperson_ID) or (customer_ID & Product_ID & Sub_type_product_ID & Salesperson_ID & Month) will be great?

Or, does it have extra knowledge I should know before I combine the tables?

1 ACCEPTED SOLUTION

Hi,
1.  You don't need to Merge theese 3 tables (creating the Join Table) if you wish to see data on "salesperson_ID" level.
You can create instead Dimensional table (table which, would contain only "salesperson_ID") and then then make 3 connections to those 3 tables.
So for example, you would connect Dimensional table with "salesperson_ID" column on one side with Brand table and "salesperson_ID" column on other side.

This should connect all your 3 tables on salesperson level.

2. Combining data or just joining them is a generally big question and there is no simple answer. The main goal here is to keep you model simple and easy to understand. So in case you would have 50 tables and you would like to create only Joins between them, it can be very chaotic and hard to understand. In this case you might merge/append some of those tables and reduce the number of final tables in your model.
Probably in most cases you will do both - some merging and some joining.



If my answer was helpful please give me a Kudos or even accept as a Solution.

Let's connect on LinkedIn!

View solution in original post

5 REPLIES 5
Migasuke
Super User
Super User

Hi @Anonymous ,

I did not really get, which columns you used for merging the queries.
If you use Product ID, then you will get duplicates.

If you merge your tables using "Unique ID" / "Transaction ID" / "Table ID", there should no be problem with any duplicates.

Duplicates appear, when you creating merge, with values, which are multiplied in the column. Basically the "relationship" would be One-Many or Many-Many.



If my answer was helpful please give me a Kudos or even accept as a Solution.

Let's connect on LinkedIn!

Anonymous
Not applicable

Hi @Migasuke,

 

Thanks for your reply. Sorry, I should explain more clear.

 

The entities 1 to 3 come from the database. And, the result gets from the data visualization part which I create a table from the report page.

 

Yes, one thing you made it correct which is the entity 2 and entity 3 are many to many relationship when I use the Potential_ID and Transaction_ID to connect these twho entities.

 

So, if I make the unique ID more specific such as add one or more attributes vlaues such as salesperson_ID and month, then, the duplicate problem can be solved? Or, I need to add more one entity to solve the many to many problem? I would like to seek your advice and experience to help me handle with this issue.

 

Thank you so much!

Hi Again,

if you want to create a custom Join column with Unique ID+Sales Person , it will not work since you don't have Sales Person column in all tables and the Merge would not work.

Your goal here, is to have same level of detail in all tables and then merge them. This table should give you the answer:

Migasuke_1-1626628054734.png


1. In First scenario, you have A table on Country level and B level on City level. If you merge them, your data will multiply (DE is 3x)

2 . In second scenario you group by your B table on countries, so then they contain the same level and merging does not create any duplicates.

3. In third case, you have both tables on City level, so you need to merge them based on Country and City and the same time (in this case using only City would be enough, but in theory there might be cities with same names in different countries). If you merge them only on Country level, you will get the duplicate the data again.

So the assumtion here is:
Take a look on all your three tables. Think, if you they have the same level of detail and then merge them using these "details" (in case three, detail would be City).



If my answer was helpful please give me a Kudos or even accept as a Solution.

Let's connect on LinkedIn!

Anonymous
Not applicable

Hi @Migasuke ,

 

Thanks for your incredible explanation.

 

I just have another further questions would like seek your help to clarify my understanding.

 

1.  If I have the salesperson_ID in these three tables, do I need to add one more Join table in the data modeling stage?Or, I can just add the value (salesperson_ID) into these three columns Brand_Table_ID, Potential_Table_ID, and Transaction_ID in the different tables, then I work at the report page and I can get the result as my expect, which solve the duplicate problem, without the merge and adding the Join table?

 

2. How can I know when I should add one more Join table or choose the merge process instead?

Hi,
1.  You don't need to Merge theese 3 tables (creating the Join Table) if you wish to see data on "salesperson_ID" level.
You can create instead Dimensional table (table which, would contain only "salesperson_ID") and then then make 3 connections to those 3 tables.
So for example, you would connect Dimensional table with "salesperson_ID" column on one side with Brand table and "salesperson_ID" column on other side.

This should connect all your 3 tables on salesperson level.

2. Combining data or just joining them is a generally big question and there is no simple answer. The main goal here is to keep you model simple and easy to understand. So in case you would have 50 tables and you would like to create only Joins between them, it can be very chaotic and hard to understand. In this case you might merge/append some of those tables and reduce the number of final tables in your model.
Probably in most cases you will do both - some merging and some joining.



If my answer was helpful please give me a Kudos or even accept as a Solution.

Let's connect on LinkedIn!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.