Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Can I create an efficient INNER JOIN in Power Query?

I've got two tables in my model. Sales, consisting of 1.000 sales records. Customers, consisting of 150.000 customers. As you can imagine, there are a lot of Customers without any Sales. I'd ideally only pull those customer records, that actually have sales. Is there an efficient way to do this in Power Query, without the necessity to go back to my DBA?

 

Thanks

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hey @Anonymous

 

I can't really speak to the efficiency of the INNER JOIN but if you merge the two queries using an INNER JOIN in the Query Editor and then filter out the customers with no sales, you will then load only the data for customers with sales into your data model. This will all be accomplished during report refresh so there won't be any performance or efficiency issues to worry about during runtime.

 

Hope this helps,

Parker

View solution in original post

6 REPLIES 6
Anonymous
Not applicable

Hey @Anonymous

 

I can't really speak to the efficiency of the INNER JOIN but if you merge the two queries using an INNER JOIN in the Query Editor and then filter out the customers with no sales, you will then load only the data for customers with sales into your data model. This will all be accomplished during report refresh so there won't be any performance or efficiency issues to worry about during runtime.

 

Hope this helps,

Parker

Anonymous
Not applicable

Hi Parker,

 

But will this still fetch all 150k records from the customer table, or would it write a Native query that first does the INNER JOIN?

 

Thanks for clarifying.

Anonymous
Not applicable

@Anonymous Just ran a quick test and can confirm that after performing the Merge (Inner Join), Power Query wrote the Native Query to perform the INNER JOIN inside of the SELECT statement. This means that proper query folding took place and Power BI will not have to bring in any of the unecessary customers from your data.

 

I would post a picture of the native query but probably shouldn't for confidentiality reasons, so you'll just have to take my word for it Smiley Very Happy

 

-Parker

Anonymous
Not applicable

When loading the data to Power Pivot, it does also pull in the entire customer table, besides the correctly reduced merge table. It seems that I need to disable the loading of the entire customer table, is there some kind of setting for that? Power Query is fetching the reduced merge table first, so it seems to be working directly on the DB. If I try to remove the original full table, I get an error though.

 

Thanks again.

Anonymous
Not applicable

Edited this response as I realized previous response made no sense...

 

I think you need to merge your Sales table to your Customers table. This way the native query will act on the Customers table and will not require bringing in the whole Customer table. Also, you can right click the Sales table and unselect "Enable Load" if you don't want the whole Sales table being loaded into the data model.

 

-Parker

 

 

 

Anonymous
Not applicable

Thanks for all your help, I was just a little unsure whether the JOIN would pull Customer data directly from the DB, or work with the locally cached version it pulled in earlier. It seems to work perfectly with the approach you suggested. I'll mark your answer as the solution.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.