cancel
Showing results for 
Search instead for 
Did you mean: 

Performance Tuning DAX - Part 2

Introduction

First part is here: https://community.powerbi.com/t5/Community-Blog/Performance-Tuning-DAX-Part-1/ba-p/976275. I suggest you read it first because otherwise this won't make a lot of sense.

 

For those of you who didn't already cheat, download the PBIX and skip ahead...

 

So, continuing where we left off in Part 1, of course I can't leave well enough alone, I really wanted to get this thing down to a sub-second calculation time. Spoiler, no such luck, but here goes:

Step 9

Looking at the code, the SUMX bothers me. SUMX is an iterator function and thus strikes me as potentially slowing things down. Let's get rid of it:

 

Total Orders 10 = 
    VAR MinDateInContext = MIN ( 'DateTimeTable'[Date] )
    VAR MaxDateInContext = MAX ( DateTimeTable[Date] )
    VAR __Table =
        FILTER(
            'Tracking History',
            AND (
                OR (  
                    'Tracking History'[Start Date] > MinDateInContext, 
                    'Tracking History'[End Date] > MaxDateInContext
                ),
                MaxDateInContext > 'Tracking History'[Start Date]
            )
        )
RETURN
    COUNTROWS(FILTER(__Table,[Start Date] > MinDateInContext || [End Date] > MaxDateInContext))

 

Performance analyzer results were as follows:

  • Total Orders 10, 22,498 milliseconds, 22 seconds

OK, basically no difference, guess the iterator functions aren't so evil after all...

Step 10

Fine, now that everything is a FILTER, let's just consolidate everything:

 

Total Orders 11 = 
    VAR MinDateInContext = MIN ( 'DateTimeTable'[Date] )
    VAR MaxDateInContext = MAX ( DateTimeTable[Date] )
RETURN
        COUNTROWS(
            FILTER(
                FILTER(
                    'Tracking History',
                    AND (
                        OR (  
                            'Tracking History'[Start Date] > MinDateInContext, 
                            'Tracking History'[End Date] > MaxDateInContext
                        ),
                        MaxDateInContext > 'Tracking History'[Start Date]
                    )
                ),
                OR (
                    [Start Date] > MinDateInContext,
                    [End Date] > MaxDateInContext
                )
            )
        )

 

Performance analyzer results were as follows:

  • Total Orders 11, 21,996 milliseconds, 22 seconds

Nothing. Hmm. Maybe we are running out of optimizations.

Step 11

So our current logic in the filters is as follows:

 

 ( (A || B) && C ) && (A || B)

 

Knowing how to reduce and simplify logic equations, we can rewrite this to:

 

(A && C) || (B & C)

 

So, let's do that:

 

Total Orders 12 = 
    VAR MinDateInContext = MIN ( 'DateTimeTable'[Date] )
    VAR MaxDateInContext = MAX ( DateTimeTable[Date] )
RETURN
    COUNTROWS(
        FILTER(
            'Tracking History',
            OR (
                AND (
                    'Tracking History'[Start Date] > MinDateInContext,
                    MaxDateInContext > 'Tracking History'[Start Date]
                ),
                AND (
                    'Tracking History'[End Date] > MaxDateInContext,
                    MaxDateInContext > 'Tracking History'[Start Date]
                )
            )
        )
    )

 

Performance analyzer results were as follows:

  • Total Orders 12, 31,413 milliseconds, 31 seconds

Ahhh!!! We broke it!! That last optimization was one too far! I ran this test multiple, multiple times, it always came back in the 30 second range whereas Total Orders 11 always came back in the 20-22 second range. Drat!

Step 12

OK, let's keep the same basic idea but go back to multiple filters so that we are filtering early!

 

Total Orders 13 = 
    VAR MinDateInContext = MIN ( 'DateTimeTable'[Date] )
    VAR MaxDateInContext = MAX ( DateTimeTable[Date] )
RETURN
        COUNTROWS(
            FILTER(
                FILTER(
                    FILTER (
                        'Tracking History',
                            OR (  
                                'Tracking History'[Start Date] > MinDateInContext, 
                                'Tracking History'[End Date] > MaxDateInContext
                            )
                    ),
                       MaxDateInContext > 'Tracking History'[Start Date]
                ),
                OR (
                    [Start Date] > MinDateInContext,
                    [End Date] > MaxDateInContext
                )
            )
        )

 

Performance analyzer results were as follows:

  • Total Orders 13, 21,215 milliseconds, 21 seconds

OK, back down to 20 seconds again!! Whew!!

Conclusion

So, unlike Part 1 where we took the calculation down from 10 minutes to 20 seconds in Part 2 we have successfully managed to more or less waste our time. 20 seconds to 20 seconds. What have we learned in Part 2?

  • Obviously nothing about better performance.
  • Know when to quit. When multiple optimizations start to result in little or no performance gain, you're probably done
  • It is possible to do things that you consider to be optimizations that actually deoptimize performance.
  • Clean code is not necessarily performance optimized code. Total Orders 12 is cleaner code, but performs worse
  • Again, the pattern of filtering early saved the day with Total Orders 13! Nested IF's bad, nested FILTER good

Hmm, look at that, five learning bullet points for Part 2 versus six for Part 1. For all of you people over the years that have told me that "you learn more from failure than from success". You are all liars. 😁

Comments

This one

 

Total Orders 14 = 
VAR MinDateInContext = MIN ( 'DateTimeTable'[Date] )
VAR MaxDateInContext = MAX ( DateTimeTable[Date] )
RETURN
COUNTROWS(
    filter(
        FILTER(
            'Tracking History',
            'Tracking History'[Start Date] < MaxDateInContext
        ),
        MinDateInContext < 'Tracking History'[Start Date]
        ||
        MaxDateInContext < 'Tracking History'[End Date]        
    )
)

 

is about 16% faster than Total Orders 13.

 

Best

D

@darlove  - I can confirm this. For comparison with article, on my machine this ran in 17,301 milliseconds or 17 seconds. So, you're maybe not giving yourself enough credit, it's close to a 20% improvement I think!

 

Any thoughts as to why moving everything into a single measure had the greatest impact?