Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Merge Query - missing rows & performance issues

 

 

I'm stumped and maybe I'm missing something obvious, hoping to find a working solution.

Power BI connect to a SQL database which pulls in two tables as separate queries - one for sales orders (listing products purchased, 1.3M records) as well as a product list (300 records) which is used as lookup table to pull price, categories, additional fees etc. etc. The product code that is unique identifier in the product list has numbers and letters which I pulled into PBI as "text" field. The issue I'm running into is that when I try to do a "merge" of the two tables in PBI, it fails to merge all the records. It will stop at 1,022 records. I've learned that using text field is problematic for merges, these are things I've tried:

 

  • transforming the product code to uppercase and trim but problem is still there.
  • using both product code and product name as related fields for the merge but it appears the names are not exactly following same formatting so it only matches half the records (500K records)
  • use fuzzy match but then it will take more than an hour to "refresh" so that won't work.

    Sample Database Diagram for Power BI questions.png

The other thing to note is the sale order table has 1.3M records so I'm wondering if that is part of the issue as well? Would it be worthwhile to create my own custom unique ID and substitute the product code field in both tables? A bit of a pain but I don't see easier way to do this?

 

The reason I need this is so I can pull in the needed details from product list into the sales order when I aggregate by categories (from prod list) and month of sale (from sales order). Originally I thought I could create a Measure and Concatenate but that doesn't work and everyone says to "merge" data instead. Otherwise, I'm thinking maybe I need to make a SQL view then pull from that into PBI due to it's limitations? Thoughts?


Thanks!

7 REPLIES 7
watkinnc
Super User
Super User

Hey there.   I bet this is the classic M issue that Chris Webb used to talk about with his "Price Paid" queries.

 

Try using this for your merge Function:

 

let
    Source = Table.NestedJoin(Product_List, {"product code"}, Table.Buffer(Table.Distinct(Sales_Orders, "product code")), {"product code"}, "Table1", JoinKind.LeftOuter),
    #"Expanded Table1" = Table.ExpandTableColumn(Source, "Table1", {"Sales Order", "Date", "Qty", "product code", "Product Name"}, {"Sales Order", "Date", "Qty", "product code.1", "Product Name.1"})
in
    #"Expanded Table1"

 

 

//With this, you are telling the function that once it finds a match, it doesn't have to keep looking, because the table column "product code" is distinct.  Buffering the table stops it from re-calling the table.

 

Try it out!---Nate


I’m usually answering from my phone, which means the results are visualized only in my mind. You’ll need to use my answer to know that it works—but it will work!!

Also, you better make sure that your column names are tighter--Product Code, product code, product Code, etc.!


I’m usually answering from my phone, which means the results are visualized only in my mind. You’ll need to use my answer to know that it works—but it will work!!
Anonymous
Not applicable

Thanks, I'll check into using that query. I don't follow what you mean by product name should be tighter? Do you mean the fact it has upper and lower case? That's the thing, this is someone else's database because if it was mine, for starters I wouldn't have alphanumeric characters in my product code field. The product name is another headache which is why I've been wrangling the data in Power BI.

I just meant that when I was writing your solution, I had to double check that column names, due to the inconsistencies.  Nevertheless, adding the Table.Distinct and the Table.Buffer to your join clause should speed things right up!


I’m usually answering from my phone, which means the results are visualized only in my mind. You’ll need to use my answer to know that it works—but it will work!!
Anonymous
Not applicable

Thanks @watkinnc for the solution, I was able to apply it and it appears to be much quicker! However, I'm still unable to match all the data as@Greg_Deckler pointed out, why not use the product code. I'm not sure why they don't all find a match, here's a screenshot: 

PowerBI_MissingRecords.png

This is matching sales order to product list. If I flip it around the other and match product list to sales order, it appears to work but as I build my dashboard, I'm missing multiple months of sales so I know it's still missing records. I think I've fixed the performance issue but have not fixed the inablity to correctly match records using the product code field because it's a text field.

 

@Greg_Deckler  how would I convert the product code to a true number as you posted "like maybe for apples, 10000 + ASCII code for A, B, C, etc." in Power BI? I see I can convert to numbers but it fails when it hits the ones with letters. I'm still learning PBI so forgive me if this seems like a newbie question.

 

Thanks!

 

 
Greg_Deckler
Super User
Super User

Wait, why can't you just use product code?

 

Perhaps even convert product code to a true number like maybe for apples, 10000 + ASCII code for A, B, C, etc.?


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...
Anonymous
Not applicable

Exactly but it doesn't work, it stops matching records at 1022 as I posted. I suspect because it's a text field and not true unique identifier. This video seemed helpful in explaining but still not sure why the all the records don't merge together:

 

https://radacad.com/be-careful-when-merging-on-text-fields-in-power-bi-using-power-query 

 

When I relate the two queries together using "product code" and build a dashboard page with table view based on, there's no issues. But when I "merge" them in power query, it doesn't merge all the records and is painfully slow that I've left for couple hour after merging then going to my page to refresh, it never succeeds.

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors