Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
blommethomas
Frequent Visitor

How to improve speed of lookup query

Hi,

 

i've written a Power Query function to lookup products in an product database.

The goal is to return a list of EAN codes (in text format using "," as delimiter) that fit the criteria.

The function looks like this:

blommethomas_0-1598283704037.png

All parameters are optional with the exception of "year" - meaning if you leave all of them blank, you get a list with all EAN codes in the database.

 

The code I use is the following. "Contract" is the name of the product database

// functtion to return all EAN codes that fit the selections on Category, Brand, Format, Size & Segment
// Can take a null value in which case all records are returned for that parameter
(Year, Category, Brand, Format, Size, Segment) =>
let 
    // load Contract table which has EAN codes
    Source = Contract,
    // select the correct year
    YearSelection = Table.SelectRows(Source, each [Contract Year] = Year),
    // make row selection on Category if parameter is provided else return all rows
    CategorySelection = Table.SelectRows(YearSelection, each if Category <> null then [Category] = Category else true),
    // do the same for Brand, Format, Size & Segment
    BrandSelection = Table.SelectRows(CategorySelection, each if Brand <> null then [Brand] = Brand else true),
    FormatSelection = Table.SelectRows(BrandSelection, each if Format <> null then [Format] = Format else true),
    SizeSelection = Table.SelectRows(FormatSelection, each if Size <> null then [Size] = Size else true),
    SegmentSelection = Table.SelectRows(SizeSelection, each if Segment <> null then [Segment] = Segment else true),

    // Select the Column with EAN codes
    EANColumn = Table.SelectColumns(SegmentSelection, "EAN CU"),
    // Change the type of the column from Numbers to Text
    Transformed = Table.TransformColumnTypes(EANColumn, {"EAN CU", type text}),
    // Change the table into a list
    EANlist = Table.ToList(Transformed),
    // Change list into string
    EANs = Text.Combine(EANlist,",")
in EANs

 

The output data is afterwards used to populate a PowerBi dashboard.

Until recently it was working fine, but now it "hangs" when refreshing. Probably because the queries are becoming too big.

To get a sense of the size of the dataset:

  • I use this function +/- 1000 times during a data transformation action in another query
  • The "Contract" database contains +/- 5000 entries

 

Unfortunately I can't share the datasets because they contain proprietary data.

 

Any ideas on how i can make this query more efficient?

 

Thomas

 

1 ACCEPTED SOLUTION

Hi @blommethomas ,

agree on what @edhans said.

In addition to that I believe the biggest improvement on performance you can make is to change the conditional row selection.
From this

CategorySelection = Table.SelectRows(YearSelection, each if Category <> null then [Category] = Category else true)

to this

CategorySelection = if Category <> null then Table.SelectRows(YearSelection, each [Category] = Category ) else YearSelection

 

In your version, the condtion is checked for every row in your table. However, in my version just once.

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

3 REPLIES 3
edhans
Super User
Super User

Take a look at this article by @ImkeF on partitioning your data. Essentially you want to group your data before doing a lot of operations on it. That way, Power Query is working on a subset of data each time. For example, if you grouped by year and kept only the relevant row, Table.SelectRows() only operates on a few records rather than thousands.

 

You may also want to look at the order you are segmenting data. Optimize your code so the earliest filters remove the biggest chunks of data. For example, if you have 3 years of data, your Year filter is only removing on average 2/3 of the data. But if your Segment field as 50 values, if you start with that, you initially whack 49/50ths of the data.

 

All of that being said, step back and see if this is where you need to do it. DAX is extremely efficient at doing this type of operation, so why not bring everything in to the DAX model and filter there? Or, if your data is in SQL Server or any other source that is foldable, are your queries having the server do the work? I suspect if it will fold, your custom function is breaking that and that is causing the mashup engine to do all of the work. If it is a flat file or source that doesn't fold, then this doesn't really matter.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Hi @blommethomas ,

agree on what @edhans said.

In addition to that I believe the biggest improvement on performance you can make is to change the conditional row selection.
From this

CategorySelection = Table.SelectRows(YearSelection, each if Category <> null then [Category] = Category else true)

to this

CategorySelection = if Category <> null then Table.SelectRows(YearSelection, each [Category] = Category ) else YearSelection

 

In your version, the condtion is checked for every row in your table. However, in my version just once.

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Thanks a lot @edhans and @ImkeF  for the swift reply.

 

I've made the change Imke suggested and indeed, it makes a massive difference.

I'll also have a look at the order of the selection queries in order to make the function even faster.

 

@edhans I agree that I probably should try doing this in DAX.

However, I tried that first and really struggled with the language syntax, so sticking to PowerQuery solution for now, because it works 🙂

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors