Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
rks
Resolver I
Resolver I

Very Slow Nested Iterator

Hi Community.

 

I have the following scenario:

* I have a product table containing 15 million products

* I have a stock table (a billion or so rows) with a relationship to the product

* The stock table also contains other references such as partno, storage location, etc.

 

Now, a product could also be a big cupboard consisting of several pieces. Each piece has got a partno and each partno has a quantity.

Let's say I have a wooden-cupboard with 3 parts. 

Product table: 1 row with attributes such as itemno, description, category and so on

Stock Table: one row per part, aggregated over partno it would look like:
Partno 1: 5 rows

Partno 2: 5 rows

Partno3: 6 rows

 

The business wants to know: how many products can I sell given my current stock situation. The answer in this case: I can sell the wooden-cupboard 5 times.

 

I have authored a measure which produces the correct results:

 

SUMX (
                'Filter Product',
                MINX (
                    VALUES ( 'Filter Parts'[PartNo] ),
                    CALCULATE ( COUNTROWS ( 'Stock' ) )
                )
            )

 

 

The result is a number which represents the number of "complete" items on stock.

 

However, it's slow, very slow; it runs 2 minutes to produce the results which most of the load in FE ( around 80%). How can I move the payload to SE?

 

Thank you and best regards

Konstantin 

 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Your formula will be faster if you create an aggregation table that will squeeze the 'Stock' table so that you don't have to count rows for PartNo in CALCULATE ( COUNTROWS ('Stock') ) but use the number of rows directly. So, you'd have a table aggStock which would be created by appropriately agrouping rows and adding one more column that would tell you the number of rows in the grouping (aggStock[PartCount]). Then the measure would be:

 

sumx(
    'Products',
    calculate( minx( 'aggStock'[PartCount] ) )
)

 

In aggStock you'd have the same columns as in Stock but it would not be as granular, so that you don't have to count the rows. This is the first speed-up that comes to my mind...

View solution in original post

8 REPLIES 8
Anonymous
Not applicable

Your formula will be faster if you create an aggregation table that will squeeze the 'Stock' table so that you don't have to count rows for PartNo in CALCULATE ( COUNTROWS ('Stock') ) but use the number of rows directly. So, you'd have a table aggStock which would be created by appropriately agrouping rows and adding one more column that would tell you the number of rows in the grouping (aggStock[PartCount]). Then the measure would be:

 

sumx(
    'Products',
    calculate( minx( 'aggStock'[PartCount] ) )
)

 

In aggStock you'd have the same columns as in Stock but it would not be as granular, so that you don't have to count the rows. This is the first speed-up that comes to my mind...

I accept the change in a data model as a solution. Obviously the nested iterator in itself is not to be optimized... 

Anonymous
Not applicable

By the way, you don't have to use MINX. You can do:

sumx(
    'Products',
    calculate( min( 'aggStock'[PartCount] ) )
)

However, under the hood MIN is always MINX:

min( T[Col] ) = minx( T, T[Col] )
Anonymous
Not applicable

Can you tell us what kind of performance boost you've got from the suggestion? How have the runtimes changed?
rks
Resolver I
Resolver I

@Anonymous @AlB Thank you for looking into this and the feedback for more details. I have created a quick sample file with this simplified data model:

rks_1-1603260665074.png

 

* Product contains the product's master data with about 13 million rows in the original model

* PartAgg is an aggregated table. In the real scenario each item on stock has a distinctive ID, the model would grow quickly to several hundred million rows in a dimension without aggregating. Of interest is the partNo and one or two other filterable columns. It contains around 200 rows.

* StorageLocation depicts the physical location of each item. The is just to illustrate that there are several other dimensions on a higher grain related to Stock (~5 million rows)

* Stock is a snapshot once per day with a snapshot date. The fact table contains up to a billion rows

 

Download here: http://s000.tinyupload.com/?file_id=57163484993199174102

@rks 

You cannot upload the file here directly. You have to share the URL to the file hosted elsewhere: Dropbox, Onedrive... or just upload the file to a site like tinyupload.com (no sign-up required).

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

SU18_powerbi_badge

Anonymous
Not applicable

You should give more details. Especially post a picture of the model itself.
AlB
Super User
Super User

Hi @rks 

You talk about two tables but your code refers to three: 'Filter Product', 'Filter Parts'  and 'Stock'.  Can you explain a bit more and show the relationships between the tables (what fields are involved)?

Please mark the question solved when done and consider giving kudos if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors