Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
AlB
Super User
Super User

Optimizing measure - Segment migration

Hi all,

See the attached file for support.

We have a simple one-table model:

CustomerId Dates Segment
6 03/01/2020 Segment2
7 03/01/2020 Segment2
6 03/02/2020 Segment1
7 03/02/2020 Segment2
6 03/03/2020 Segment1
7 03/03/2020 Segment1

 

with info on what segment customers belonged to at specific dates. We build a matrix visual with Segment in rows, Dates in columns and this measure in values:

 

Net new customers = 
VAR _CurrentDate =
    SELECTEDVALUE (Table1[Dates] )
VAR _PreviousDate =
    CALCULATE (
        MAX ( Table1[Dates] ),
        ALLSELECTED ( Table1[Dates] ),
        Table1[Dates] < _CurrentDate
    )
VAR _CustomerIDsThisMonth = DISTINCT(Table1[CustomerId]) 
VAR _CustomerIDsPreviousMonth =
    CALCULATETABLE (
        DISTINCT(Table1[CustomerId]),
        Table1[Dates] = _PreviousDate
    )
RETURN
    COUNTROWS(EXCEPT(_CustomerIDsThisMonth, _CustomerIDsPreviousMonth))

 

This provides the number of new customers that moved to that segment at that date (from the previous date).

This works fine for low number of customers. When that increases, however, it gets quite slow. For instance, with 1 million customers and info on 10 dates in the table, the measure takes around 13 secs to execute in the visual. If the number of customers and dates is higher, like in the real scenario, it is much slower still.

Any ideas on how to optimize this, make it faster?

Note that you can change the size of the table by changing the number of customers and number of periods (dates) in the query editor. Simply edit the parameters NumCustomers and NumPeriods.

@mahoneypat , @MFelix @TomMartens 

 

 

1 ACCEPTED SOLUTION

 

Hi @AlB ,

 

try this.

Net new customers v2 = 
VAR _CurrentDate =
    SELECTEDVALUE (Table1[Dates] )
VAR _PreviousDate =
    CALCULATE (
        MAX ( Table1[Dates] ),
        ALLSELECTED ( Table1[Dates] ),
        Table1[Dates] < _CurrentDate
    )
VAR _CountCustomerIDsThisMonth =
    CALCULATE( 
        DISTINCTCOUNT(Table1[CustomerId]),
        Table1[Dates] = _PreviousDate ||
        Table1[Dates] = _CurrentDate
    )
VAR _CountCustomerIDsPreviousMonth =
    CALCULATE( 
        DISTINCTCOUNT(Table1[CustomerId]),
        Table1[Dates] = _PreviousDate
    )
RETURN
    _CountCustomerIDsThisMonth - _CountCustomerIDsPreviousMonth

 

 

 

Did I answer your question?
Please mark my post as solution, this will also help others.
Please give Kudos for support.

Marcus Wegener works as Full Stack Power BI Engineer at BI or DIE.
His mission is clear: "Get the most out of data, with Power BI."
twitter - LinkedIn - YouTube - website - podcast


View solution in original post

11 REPLIES 11
AlB
Super User
Super User

@mwegener 

Thanks so much for your continued efforts. V5 shows a bit of an improvement on 1M customers, but not really significant.

- [ Net new customers purchases V5]: 21.5 secs

Perhaps we've reached the optimal approach without using PQ or changing the model?

Thanks

 

This is quite possible.

Did I answer your question?
Please mark my post as solution, this will also help others.
Please give Kudos for support.

Marcus Wegener works as Full Stack Power BI Engineer at BI or DIE.
His mission is clear: "Get the most out of data, with Power BI."
twitter - LinkedIn - YouTube - website - podcast


AlB
Super User
Super User

@mwegener 

Thanks for V4. Smart alternative.  It doesn't seem to run faster, I'm  afraid. These are the readings in DAX Studio with 1M customers:

- [ Net new customers purchases]:     22.5 secs   

- [ Net new customers purchases V2 ] : 25.7 secs

- [ Net new customers purchases V4 ]:  23.7 secs

 

So no massive differences. V4 actually introduces a Callback Data at the end but that doesn't seem to have a big effect. Earlier I claimed [ Net new customers purchases V2 ] was twice as fast as  [ Net new customers purchases] but i cannot reproduce that now. Perhaps I made a mistake measuring, it was with the Performance analyzer with PBI. Dax Studio should be more reliable

Thanks

 

 

 

 

 

Hi @AlB ,

 

try this.

Net new customers purchases V5 =
VAR _CurrentDate =
    SELECTEDVALUE ( Table1[Dates] )
VAR _PreviousDate =
    CALCULATE ( MAX ( Table1[Dates] ), Table1[Dates] < _CurrentDate )
VAR _NewCustomer =
    FILTER (
        VALUES ( Table1[CustomerId] ),
        VAR _CountPreviousCustomer =
            CALCULATE ( COUNT ( Table1[CustomerId] ), Table1[Dates] = _PreviousDate )
        VAR _IsInPreviousDate =
            _CountPreviousCustomer = BLANK ()
        RETURN
            _IsInPreviousDate
    )
RETURN
    CALCULATE ( SUM ( Table1[Sales] ), _NewCustomer )
Did I answer your question?
Please mark my post as solution, this will also help others.
Please give Kudos for support.

Marcus Wegener works as Full Stack Power BI Engineer at BI or DIE.
His mission is clear: "Get the most out of data, with Power BI."
twitter - LinkedIn - YouTube - website - podcast


AlB
Super User
Super User

@mwegener 

I'm primarily interested in a DAX solution but if it cannot be optimized further in DAX PQ is an option too

 

AlB
Super User
Super User

@mwegener  (and everyone else)

Allow me to challenge you a bit further. See attached file for support.

We now want to calculate the revenue (sales) that the new customers that moved to the segment have generated. For that we have added an extra column (Table1[Sales]). We then have an initial approach, similar to [Net new customers]. Very slow as well:

 

Net new customers purchases = 
VAR _CurrentDate = SELECTEDVALUE ( Table1[Dates] )
VAR _PreviousDate = CALCULATE ( MAX ( Table1[Dates] ), Table1[Dates] < _CurrentDate )
VAR _CustomerIDsThisMonth = DISTINCT ( Table1[CustomerId] )
VAR _CustomerIDsPreviousMonth =
    CALCULATETABLE (
        DISTINCT ( Table1[CustomerId] ),
        Table1[Dates] = _PreviousDate
    )
RETURN
    CALCULATE (
        SUM ( Table1[Sales] ),
        EXCEPT ( _CustomerIDsThisMonth, _CustomerIDsPreviousMonth )
    )

 

Following an approach similar to @mwegener 's V2 (although it cannot be followed completely in this case), we can make it twice as fast, but still relatively slow:

 

Net new customers purchases V2 = 
VAR _CurrentDate = SELECTEDVALUE ( Table1[Dates] )
VAR _PreviousDate = CALCULATE ( MAX ( Table1[Dates] ), Table1[Dates] < _CurrentDate )
VAR _CustomerIDsThisMonth =
    CALCULATETABLE (
        DISTINCT ( Table1[CustomerId] ),
        Table1[Dates] IN  {_PreviousDate , _CurrentDate}
    )
VAR _CustomerIDsPreviousMonth =
    CALCULATETABLE (
        DISTINCT ( Table1[CustomerId] ),
        Table1[Dates] = _PreviousDate
    )
RETURN
    CALCULATE ( SUM ( Table1[Sales] ), _CustomerIDsThisMonth )
        - CALCULATE ( SUM ( Table1[Sales] ), _CustomerIDsPreviousMonth )

 

Can you make it faster?

Many thanks

Hi @AlB ,

try this.

 

Net new customers purchases V4 = 
VAR _CurrentDate =
    SELECTEDVALUE ( Table1[Dates] )
VAR _PreviousDate =
    CALCULATE ( MAX ( Table1[Dates] ), Table1[Dates] < _CurrentDate )
RETURN
    SUMX (
        Table1,
        IF (
            CALCULATE (
                COUNTROWS ( FILTER ( Table1, Table1[Dates] = _PreviousDate ) ),
                REMOVEFILTERS ( Table1[Dates] )
            ) >= 1,
            BLANK (),
            Table1[Sales]
        )
    )

 

 

 

Did I answer your question?
Please mark my post as solution, this will also help others.
Please give Kudos for support.

Marcus Wegener works as Full Stack Power BI Engineer at BI or DIE.
His mission is clear: "Get the most out of data, with Power BI."
twitter - LinkedIn - YouTube - website - podcast


Hi @AlB,

can we use Power Query?

 

Did I answer your question?
Please mark my post as solution, this will also help others.
Please give Kudos for support.

Marcus Wegener works as Full Stack Power BI Engineer at BI or DIE.
His mission is clear: "Get the most out of data, with Power BI."
twitter - LinkedIn - YouTube - website - podcast


AlB
Super User
Super User

Hi @mwegener 

Many thanks for your responses. How didn't I think of this approach 🙄 ?? V2 is way way faster. Great!

Interestingly enough, though, V3 is slower than the original one 🤔 which I wouldn't expect seeing V2's performance 

 

 

 

Hi @AlB ,

 

try this.

Net new customers v2 = 
VAR _CurrentDate =
    SELECTEDVALUE (Table1[Dates] )
VAR _PreviousDate =
    CALCULATE (
        MAX ( Table1[Dates] ),
        ALLSELECTED ( Table1[Dates] ),
        Table1[Dates] < _CurrentDate
    )
VAR _CountCustomerIDsThisMonth =
    CALCULATE( 
        DISTINCTCOUNT(Table1[CustomerId]),
        Table1[Dates] = _PreviousDate ||
        Table1[Dates] = _CurrentDate
    )
VAR _CountCustomerIDsPreviousMonth =
    CALCULATE( 
        DISTINCTCOUNT(Table1[CustomerId]),
        Table1[Dates] = _PreviousDate
    )
RETURN
    _CountCustomerIDsThisMonth - _CountCustomerIDsPreviousMonth

 

 

 

Did I answer your question?
Please mark my post as solution, this will also help others.
Please give Kudos for support.

Marcus Wegener works as Full Stack Power BI Engineer at BI or DIE.
His mission is clear: "Get the most out of data, with Power BI."
twitter - LinkedIn - YouTube - website - podcast


HI @AlB ,

 

depending on the underlying data, this adjustment could also help.

Net new customers v3 = 
VAR _CurrentDate =
    SELECTEDVALUE (Table1[Dates] )
VAR _PreviousDate =
    CALCULATE (
        MAX ( Table1[Dates] ),
        ALLSELECTED ( Table1[Dates] ),
        Table1[Dates] < _CurrentDate
    )
VAR _CountCustomerIDsThisMonth =
    CALCULATE( 
        SUMX(VALUES(Table1[CustomerId]),1),
        Table1[Dates] = _PreviousDate ||
        Table1[Dates] = _CurrentDate
    )
VAR _CountCustomerIDsPreviousMonth =
    CALCULATE( 
        SUMX(VALUES(Table1[CustomerId]),1),
        Table1[Dates] = _PreviousDate
    )
RETURN
    _CountCustomerIDsThisMonth - _CountCustomerIDsPreviousMonth

 

Got it from the Guy in a Cube video "Debugging a slow Power BI report with Phil Seamark"

https://www.youtube.com/watch?v=eABg872TAJU

 

 

Did I answer your question?
Please mark my post as solution, this will also help others.
Please give Kudos for support.

Marcus Wegener works as Full Stack Power BI Engineer at BI or DIE.
His mission is clear: "Get the most out of data, with Power BI."
twitter - LinkedIn - YouTube - website - podcast


Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.