Reply
Regular Visitor
Posts: 20
Registered: ‎01-06-2016
Accepted Solution

Lost Customers - Issues with DAX Calculation

[ Edited ]

I'm having an issue with a DAX calc for lost customers and I'm hoping someone can spot where we went wrong. I have followed the pattern described in http://www.daxpatterns.com/new-and-returning-customers/ to get lost customers, but it is returning unexpected results. I'm hoping there is just a mistake in my formula that I am overlooking. I'm curious if anyone else has tried this and experienced a similar issue.

 

I have tables for Customer, Date, Sales, which are relevant here. Sales are at the invoice line level, and each line is related to customer by CustomerKey (surrogate key to a type 2 SCD) and date through InvoiceDateKey (surrogate key to date dim). The InvoiceDate field was added as a calculated column to the Sales table.

 

We are defining lost customers as any customer who hasn't made a purchase in 60 consecutive days ending in the selected timeframe. I have a customer that had purchases on 18-Dec-15, 29-Jan-16, and 12-Feb-16, but is somehow being counted as a lost customer in February. The only thing I noticed about them is that the Dec sales records are related to a different customer key than the Jan and Feb sales because someone corrected the customer name. I don't think this should happen since I'm counting unique [Customer No] values rather than [CustomerKey] values. 

 

Here's my formula: 

Lost Customers :=
IF (
    NOT (
        MIN ( 'Date'[Full Date] )
            > CALCULATE ( MAX ( Sales[Invoice Date] ), ALL ( Sales ) )
    ),
    COUNTROWS (
        FILTER (
            ADDCOLUMNS (
                FILTER (
                    CALCULATETABLE (
                        ADDCOLUMNS (
                            VALUES ( Sales[CustomerNO] ),
                            "CustomerLostDate", CALCULATE ( MAX ( Sales[Invoice Date] ) ) + [Lost Days Limit]
                        ),
                        FILTER (
                            ALL ( 'Date' ),
                            AND (
                                'Date'[Full Date] < MIN ( 'Date'[Full Date] ),
                                'Date'[Full Date]
                                    >= MIN ( 'Date'[Full Date] ) - [Lost Days Limit]
                            )
                        )
                    ),
                    AND (
                        AND (
                            [CustomerLostDate] >= MIN ( 'Date'[Full Date] ),
                            [CustomerLostDate] <= MAX ( 'Date'[Full Date] )
                        ),
                        [CustomerLostDate] <= CALCULATE ( MAX ( Sales[Invoice Date] ), ALL ( Sales ) )
                    )
                ),
                "FirstBuyInPeriod", CALCULATE ( MIN ( Sales[Invoice Date] ) )
            ),
            OR ( ISBLANK ( [FirstBuyInPeriod] ), [FirstBuyInPeriod] > [CustomerLostDate] )
        )
    )
)

 

 Any ideas where my issue is? 

 

Edit: 

 

Here are the results in a pivot table. Notice the first pivot has only Customer No and seems to work. The second includes customer name and provides unexpected results.

LostCust.jpg


Accepted Solutions
Highlighted
Member
Posts: 67
Registered: ‎06-24-2015

Re: Lost Customers - Issues with DAX Calculation

Good point - you have to remove some filter from the CustomerLostDate column expression.

Try this one replacing the ADDCOLUMNS in original formula:

ADDCOLUMNS (
    CALCULATETABLE ( VALUES ( Customer[CustomerCode] ), Sales ),
    "CustomerLostDate", CALCULATE (
        MAX ( Sales[Invoice Date] ),
        ALLEXCEPT ( Customer, Customer[CustomerCode] )
    )
        + [Lost Days Limit]
)

View solution in original post


All Replies
Super User
Posts: 1,548
Registered: ‎07-03-2015

Re: Lost Customers - Issues with DAX Calculation

I have followed the pattern and it worked for me.  The only obvious thing I see is that you have an extract IF statement at the start. Have you tried it without this?

* Matt is a Microsoft MVP (Power BI) and author of the Power BI Book Supercharge Power BI.
Regular Visitor
Posts: 20
Registered: ‎01-06-2016

Re: Lost Customers - Issues with DAX Calculation

I removed it and got the same results. Thanks for the suggestion, though.

 

I'm thinking that it has to do with the customer dimension being a type 2 SCD. The Dec sales are on one customer key (surrogate key) and the Jan and Feb sales are on another. I don't really understand how that could affect this since I never use the customer key anywhere, and the customer number (business key) is the same for all the customer keys for that customer. 

Super User
Posts: 1,548
Registered: ‎07-03-2015

Re: Lost Customers - Issues with DAX Calculation

So you have 2 IDs for the customer?  That sounds suspicious.  I think (eg I am not 100% sure but give it a try) that your dim table needs to connect to a primary key for the customer, and then use the same primary key for all your calcs.  If I understand you correctly (which I may not be), you are joining on one key (SCD Surrogate) and then doing the calc on a different ID.  If this is what you are doing, I am thinking this could be the culprit.

* Matt is a Microsoft MVP (Power BI) and author of the Power BI Book Supercharge Power BI.
Regular Visitor
Posts: 20
Registered: ‎01-06-2016

Re: Lost Customers - Issues with DAX Calculation

Slowly changing dimensions (SCDs) are a common data warehousing practice used to capture history of what an entity looked like at a particular time.  http://www.kimballgroup.com/2008/09/slowly-changing-dimensions-part-2/

 

I cannot connect to just the customer number (the natural/business key) because there are multiple rows in my customer table that have this key. The customer table is tied to the sales fact table via the surrogate key (an autoincremented number that provides no real value other than to uniquely identify a row and serve as a primary key for the dimension table). So I cannot simply count customer keys because that would result in incorrect numbers. To count the number of lost/recovered/new/any customers, I must do a distinct count on the customer number, which would normally be the same as the customer key if I weren't keeping history (and is the same in the DAX pattern).   

 

So basically, I'm trying to figure out how to alter the pattern to work with a slowly changing dimension. 

Super User
Posts: 1,548
Registered: ‎07-03-2015

Re: Lost Customers - Issues with DAX Calculation

Yes I know what a SCO is and I know why they exist.  My point is that I think this is causing the issue.  I believe the lost customer formula expects the dim table to have 1 record per customer and not multiple (best guess).  @marcorusso would have to tell you for sure.  

 

I am not saying it is right or wrong, but when I create models, I always take the last view of the customer as my dim table.  If I need a historical view of the customer then I assume you can just load a fact table with the historical records for those customers that have changed and use DAX to extract the historical data.  

 

Once again, this is just an assumption and hopefully Marco can give the definitive answer. 

* Matt is a Microsoft MVP (Power BI) and author of the Power BI Book Supercharge Power BI.
Member
Posts: 67
Registered: ‎06-24-2015

Re: Lost Customers - Issues with DAX Calculation

If you have an SCD2, instead of

VALUES ( Sales[CustomerNO] )

you should use

CALCULATETABLE ( VALUES ( Customer[CustomerCode] ), Sales )

 

where CustomerCode is the key for the customer that is the same across all the versions (surrogate keys) of the customer itself.

 

Please note this will slow down the performance and it is the reason why we used the column in Sales.

A best practice is to denormalize the customer code in the fact table, just to use that in this calculation.

 

Please note that if you are using Excel 2016 or Power BI Desktop, it is possible to write the same pattern using the new set functions (INTERSECT, EXCEPT, UNION) in a much faster way. We'll update the patterns for these versions, but I'm worried we'll not have time until this autumn.

 

Regular Visitor
Posts: 20
Registered: ‎01-06-2016

Re: Lost Customers - Issues with DAX Calculation

[ Edited ]

Thank you, Marco. That gives me the correct answer if I only have the Customer No in a pivot table. If I include the Customer Name in the pivot table (this is the attribute that changed over time), it counts them as lost. Is there a way to alter the calculation to change this behavior so I get consistent results in both pivots? 

 

For example, I have a customer who bought something in December and then their name changed, then they had purchases in January and February. 

  

Customer 
CustomerKey Customer No Customer Name EffectiveStart EffectiveEnd RowIsCurrent
1           1           Person A      1/1/1900 1     2/31/2015    0
2           1           Person A2     1/1/2016                    1
SalesInvoice 
InvoiceDateKey InvoiceDate CustomerKey ProductKey SalesAmount
20151218       12/18/2015  1           3          10
20160129       1/29/2016   2           4          15
20160216       2/12/2016   2           5          20

Their pivot with just Customer No looks good (they are not counted as lost). But if I include Customer Name after the Customer Number, they are counted as lost in February, despite the fact that I am not using the CustomerKey (surrogate key) anywhere in the formula.

LostCust.jpg 

Highlighted
Member
Posts: 67
Registered: ‎06-24-2015

Re: Lost Customers - Issues with DAX Calculation

Good point - you have to remove some filter from the CustomerLostDate column expression.

Try this one replacing the ADDCOLUMNS in original formula:

ADDCOLUMNS (
    CALCULATETABLE ( VALUES ( Customer[CustomerCode] ), Sales ),
    "CustomerLostDate", CALCULATE (
        MAX ( Sales[Invoice Date] ),
        ALLEXCEPT ( Customer, Customer[CustomerCode] )
    )
        + [Lost Days Limit]
)
Regular Visitor
Posts: 20
Registered: ‎01-06-2016

Re: Lost Customers - Issues with DAX Calculation

Thanks so much, Marco. That did the trick. For anyone else who happens to need it, here is my full DAX calculation: 

Lost Customers:=IF (
    NOT (
        MIN ( 'Date'[Full Date] )
            > CALCULATE ( MAX ( Sales[Invoice Date] ), ALL ( Sales ) )
    ),
    COUNTROWS (
        FILTER (
            ADDCOLUMNS (
                FILTER (
                    CALCULATETABLE (
                        ADDCOLUMNS (
                            CALCULATETABLE ( VALUES ( Customer[Customer No] ), Sales ),
                            "CustomerLostDate", CALCULATE (
                                MAX ( Sales[Invoice Date] ),
                                ALLEXCEPT ( Customer, Customer[Customer No] )
                            )
                                + [Lost Days Limit]
                        ),
                        FILTER (
                            ALL ( 'Date' ),
                            AND (
                                'Date'[Full Date] < MIN ( 'Date'[Full Date] ),
                                'Date'[Full Date]
                                    >= MIN ( 'Date'[Full Date] ) - [Lost Days Limit]
                            )
                        )
                    ),
                    AND (
                        AND (
                            [CustomerLostDate] >= MIN ( 'Date'[Full Date] ),
                            [CustomerLostDate] <= MAX ( 'Date'[Full Date] )
                        ),
                        [CustomerLostDate] <= CALCULATE ( MAX ( Sales[Invoice Date] ), ALL ( Sales ) )
                    )
                ),
                "FirstBuyInPeriod", CALCULATE ( MIN ( Sales[Invoice Date] ) )
            ),
            OR ( ISBLANK ( [FirstBuyInPeriod] ), [FirstBuyInPeriod] > [CustomerLostDate] )
        )
    )
)