cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
charleshale Regular Visitor
Regular Visitor

Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

Hi.  While I've mastered most of the basics in DAX and M Query for churn, there's a scenario that I find quite interesting that I wanted to see if people had feedback for.

 

Consider a churn analysis for a typical large (30m monthly unique visitor) publisher's email subscriber list - snapshot below:

 

image.png

 

Here's what these columns represent and why they should be all that's needed to determine churn:

 

1. Profile ID represents a unique subscriber (in this case it's just a hashed email address).  For DAX churn calcs, we'll basically be counting these under various contexts.

2. Profile Created Date in this instance determines cohort since this is the date someone joins.   What I'm going to suggest below is a churn metric that uses a median of the churn rate across certain cohorts.

3. Engagement status.   The most basic level of churn happens when someone goes (i) "hardbounce" or (ii) "optout".  I call this "Hard" Churn.   The Latest Interaction date captures the date of this event.

4.  (and here's the hardest part) Latest Interaction.   Most subscribers in these email lists never hardbounce or optout and yet need to be counted as what I call "Soft" Churn because they disengage to the point where don't visit and need to be counted as churn.    

 

Accordingly, does anyone have any suggestions on how I'd write a DAX query to get the following: 

 

A.  Median 28 Day "Hard" Churn for the Prior 12 Months:  taking into memory for each of the last 12 fully completed monthly cohorts (ie since it's now Jan 2019, we would look at the cohorts of users who signed up during 2018-01, 2018-02 .... 2018-12), calculating the percent of Profile ID's that "hard churn" -- defined as "hardbounce" and "optout" -- in each cohort, and from that creating a measure of the median of  those 12 churn percentages.

 

(note I've done this -- harder part follows below)

 

B.  Setting up the 28 day period as a variable or parameter so that we can look at churn over any number of days while automatically counting only the last 12 periods that are fully completed given that number of days.  In other words, let's say we want to change 28 Day Hard Churn to 90 Day Hard Churn, the 12 cohorts analyzed need to be 90 days earlier (2017-09, 2017-10 .... 2018-09) because a cohort will only be measurable for churn purposes if its last day is older than the # of days history being looked at.  


(and here's the hardest part)

C. Adding "Soft" Churn" to #1 and #2, above.     "Soft" Churn is someone who should be counted as having churned because they havent visited the site in over X days (let's use 120 days for now - although ideally this is Parameter).  This is the part that has made my brain hurt the most because the DAX needs (I think) to draw a variable based the difference between a fixed date that the numbers were drawn (ie 1/20/2019) and Latest Interaction to see if the Profile ID should be deemed churned due to inactivity.  I have done this manually through tremendous amounts of calculated columns.  Do any DAX geniuses know how to do this with measures from just the 4 primary columns, above?

 

Note: since anyone interested in this is probably interested in what we see as churn, where's what the numbers look like for us (based on my more manual and more M-Query calcs):   see chart below: we typically see hard churn ~16% in the first month of any cohort, then an additional 6% in month 2, then about another 20% over the rest of the year for ~40% in the first year if we count 6 month inactives as having churned.   6 month inactives come back only about 5% of the time.

 

Annotation 2019-01-28 151142.jpg\

If I were super-skilled DAX, ideally I'd be able to show a stacked area chart for the median of any given set of cohorts where the Y axis is % churn, the X axis is time, and the stacked areas are "hard churn" and "soft churn" (judged by X day inactivity)

3 ACCEPTED SOLUTIONS

Accepted Solutions
Super User
Super User

Re: Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

@charleshale

 

Hey Charlie, how about the approach below? I haven't tested it because your sample data has only one date for all entries but see if it helps, at least the logic. 

 

What I've done is add an additional step to filter out the users earlier classified as soft churners that at some point had "optout" or "hardbounce" in Table1[Engagement] . That's what I've understood you need. Correct me if I'm mistaken. We go over the list and check for each ID whether its values in Table1[Engagement] (returned by the CALCTABLE) include  "optout" or "hardbounce" (the intersect would be non-blank).

 

SoftChurnersCountExcludingHard =
VAR _RefDate =
    DATE ( 2019, 01, 20 ) //This can be set via a slicer as well
VAR _PeriodOfInactivity = 120 //Number of days. Change here or through a slicer
VAR _CutOffDate = _RefDate - 120
VAR _SoftChurnersList =
    FILTER (
        DISTINCT ( Table1[Profile ID] ),
        CALCULATE ( MAX ( Table1[Latest Interaction] ) ) < _CutOffDate
    )
VAR _SoftChurnersExcludingHard =
    FILTER (
        _SoftChurnersList,
        COUNTROWS (
            INTERSECT (
                CALCULATETABLE ( DISTINCT ( Table1[Engagement] ) ),
                { "optout", "hardbounce" }
            )
        )
            > 0
    )

RETURN COUNTROWS ( _SoftChurnersExcludingHard )

I've just taken the previous code as base and extended it so I am a bit worried about the performance of this approach. This might not be the most efficient but well, if your dataset is not very large it should be fine. Otherwise we could explore further.

 

 

         

Super User
Super User

Re: Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

@charleshale

 

I haven't checked the rest of the measure but _SoftChurnersCount doesn't return what you expect because _CountSubs_NonBounceNonOptout is a variable.

Variables in DAX are immutable. Once you assign a value to a variable at declaration, its contents won't change. Ever. It's like a constant, a behavior different from what you get in other programming languages.

So your

   

CALCULATE( _CountSubs_NonBounceNonOptout, ...NoMatterWhatComesHere...)

will always return _CountSubs_NonBounceNonOptout. It's like if you did

    

CALCULATE( 7, WhateverFiltersGoHere)

you'll always get 7.

  _CountSubs_NonBounceNonOptout would need to be a measure for the pattern you are trying to apply to work, plus perhaps some minor modification to the code.

 

Cheers

Super User
Super User

Re: Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

@charleshale

 

So taking into account the comments on the previous posting, I believe your measure could be simplified a bit:

 

\SoftChurnersTRY3_V2 = 
      CALCULATE( 
        COUNTROWS( Table1 ) , 
            FILTER (Table1, Table1[Latest Interaction] < EDATE(Table1[Profile Created Date],12) ),
            FILTER(Table1, Table1[Engagement] <> "optout" && Table1[Engagement] <> "hardbounce" )
            )

 

where we make use of the beautiful row context available within FILTER( ) Smiley Happy Note we're not using the two additional measures you created.

 

Or we could write the condition in the second FILTER( ) by using the IN operator (no conceptual change, just a different syntax to state the very same):

 

\SoftChurnersTRY3_V3 = 
      CALCULATE( 
        COUNTROWS( Table1 ) , 
            FILTER (Table1, Table1[Latest Interaction] < EDATE(Table1[Profile Created Date],12) ),
            FILTER(Table1, NOT Table1[Engagement] IN { "optout" , "hardbounce"} )
            )

 

There are still some additional (small) changes that could be done. And maybe some modification to increase performance, although that will depend to a substantial extent on the nature of your data.

 

By the way, I'm curious. Why do you start the names of your measures with a "\"? Is it to group them alphabetically so that they show all together? 

 

Cheers

11 REPLIES 11
Super User
Super User

Re: Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

Hi @charleshale

 

Very interesting post Smiley Happy  It's always best to show your data in text/tabular form in addition to screen captures. It can then be readily copied and you make things easier for people trying to help.  

 

I ain't positive I've understood correctly but let's take it one step at a time and have a go at point C, which you claim to be most interested in.

If you want to identify and count the users that have not been active in the last 120 days you could try the following. For every ID, we calculate the latest date it appears on the table and filter out those that last appeared within the last 120 days.  The remaining ones are the soft churners.

 

SoftChurnersCount =
VAR _RefDate =
    DATE ( 2019, 01, 20 ) //This can be set via a slicer as well
VAR _PeriodOfInactivity = 120 //Number of days. Change here or through a slicer
VAR _CutOffDate = _RefDate - 120
VAR _SoftChurnersList =
    FILTER (
        DISTINCT ( Table1[Profile ID] ),
        CALCULATE ( MAX ( Table1[Latest Interaction] ) ) < _CutOffDate
    )
RETURN
    COUNTROWS ( _SoftChurnersList )

 

Does that make sense?

It is possible that some users that have not been active for the last 120 days did actually "hardbounce" or "optout" and thus are hard churners. If we want to consider that (I guess so) we'd need to do some changes.  Probably using some of the stuff you already have.

 

 

Code formatted with   www.daxformatter.com

 

 

charleshale Regular Visitor
Regular Visitor

Re: Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

@AlB- Thanks for the response.  Your soft churner DAX is super elegant.  That's the sort of DAX I'm trying to teach myself.   How would you add an additional filter in the VAR = _SoftChurnersList to exclude in the count Engagement = optout || hardbounce because I believe those would otherwise be double counted.  Thank you.  Charlie

 

Profile IdProfile Created DateEngagementLatest Interaction
5c1c1022a41e4f0a3605618812/20/2018 16:56engaged12/20/2018 17:39
5c1c0f3a52ba1e7aa107c89c12/20/2018 16:52passive12/20/2018 16:56
5c1c0ec3576f2c39ab2e610c12/20/2018 16:50passive12/20/2018 16:54
5c1c0e7c20122e74b53ca11012/20/2018 16:49active12/20/2018 16:50
5c1c0cfe2a077c3d82788e5012/20/2018 16:43passive12/20/2018 16:43
5c1c0c85a777396a5406e79712/20/2018 16:41engaged12/20/2018 17:19
5c1c0c346c28c735f903386a12/20/2018 16:40passive12/20/2018 16:41
5c1c0b2a2a077c7e904107dd12/20/2018 16:35passive12/20/2018 16:37
5c1c09e47e553f20ad35a11912/20/2018 16:30active12/20/2018 16:31
5c1c09827e553f22590fc49a12/20/2018 16:28passive12/20/2018 16:41
5c1c090e20122e6e8d2f5b3512/20/2018 16:26passive12/20/2018 16:48
5c1c08aa67808976ec490fcd12/20/2018 16:24passive12/20/2018 16:38
5c1c081e20122e74b53c99f212/20/2018 16:22passive12/20/2018 16:38
5c1c07a620122e3a972edbdb12/20/2018 16:20passive12/20/2018 16:21
5c1c07376780896a331a025e12/20/2018 16:18engaged12/20/2018 16:24
5c1c06e16780896a331a021212/20/2018 16:17passive12/20/2018 16:32
5c1c0666283d8e6880185c4f12/20/2018 16:15passive12/20/2018 16:49
5c1c058a7e553f4b3b23495b12/20/2018 16:11engaged12/20/2018 16:13
5c1c05280564ce5ee53d022b12/20/2018 16:10optout12/20/2018 16:23
5c1c02d552ba1e050c3ac8cf12/20/2018 16:00passive12/20/2018 16:30
5c1c02ccbf0081454006b0a512/20/2018 15:59passive12/20/2018 16:50
5c1c0296576f2c4e7c69a81212/20/2018 15:59passive12/20/2018 16:34
5c1c0233283d8e771d503da012/20/2018 15:57engaged12/20/2018 16:29
5c1c00890564ce2b4f11458d12/20/2018 15:50active12/20/2018 16:56
5c1c008640f866295f71482412/20/2018 15:50passive12/20/2018 17:24
5c1c00237ace5a0c01162ced12/20/2018 15:48engaged12/20/2018 15:54
5c1bffdd67808925010f38a712/20/2018 15:47passive12/20/2018 17:24
5c1bff0e576f2c39ab2e5d8f12/20/2018 15:43engaged12/20/2018 17:00
5c1bfecbb90c2f589757944912/20/2018 15:42passive12/20/2018 15:48
5c1bfe7944d51f5d6943c08112/20/2018 15:41engaged12/20/2018 16:31
5c1bfe757ace5a45c90c554e12/20/2018 15:41passive12/20/2018 16:24
5c1bfcbea77739428b6770cf12/20/2018 15:34active12/20/2018 16:33
5c1bfbf26c28c70b0c17f26712/20/2018 15:30passive12/20/2018 15:41
5c1bfbcc52ba1e050c3ac2a912/20/2018 15:30passive12/20/2018 17:01
5c1bfb7d954fcf7e3169976412/20/2018 15:28passive12/20/2018 15:43
Super User
Super User

Re: Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

@charleshale

 

Hey Charlie, how about the approach below? I haven't tested it because your sample data has only one date for all entries but see if it helps, at least the logic. 

 

What I've done is add an additional step to filter out the users earlier classified as soft churners that at some point had "optout" or "hardbounce" in Table1[Engagement] . That's what I've understood you need. Correct me if I'm mistaken. We go over the list and check for each ID whether its values in Table1[Engagement] (returned by the CALCTABLE) include  "optout" or "hardbounce" (the intersect would be non-blank).

 

SoftChurnersCountExcludingHard =
VAR _RefDate =
    DATE ( 2019, 01, 20 ) //This can be set via a slicer as well
VAR _PeriodOfInactivity = 120 //Number of days. Change here or through a slicer
VAR _CutOffDate = _RefDate - 120
VAR _SoftChurnersList =
    FILTER (
        DISTINCT ( Table1[Profile ID] ),
        CALCULATE ( MAX ( Table1[Latest Interaction] ) ) < _CutOffDate
    )
VAR _SoftChurnersExcludingHard =
    FILTER (
        _SoftChurnersList,
        COUNTROWS (
            INTERSECT (
                CALCULATETABLE ( DISTINCT ( Table1[Engagement] ) ),
                { "optout", "hardbounce" }
            )
        )
            > 0
    )

RETURN COUNTROWS ( _SoftChurnersExcludingHard )

I've just taken the previous code as base and extended it so I am a bit worried about the performance of this approach. This might not be the most efficient but well, if your dataset is not very large it should be fine. Otherwise we could explore further.

 

 

         

charleshale Regular Visitor
Regular Visitor

Re: Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

@AlBYou are a DAX genius.   I've never event seen intersect in all the books I'm reading.   Going to try it out.   Will report back on this thread

charleshale Regular Visitor
Regular Visitor

Re: Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

Update: for the this function to work with cohorts, the measurement dates have to be relative to the sign up dates.   The below code is getting me close to the right calcs.  
 
Technical question:  Why does the following "CALCULATE(....Table1[Latest Interaction] < _MeasureDate ") context that I'm using below fail to count right?   @AlB used FILTER ( Calculate (Max... , which clearly works, whereas  @avisinghand Rob collie's terrific book seems to suggest to my amateur eyes that I should be using the CALCULATE with a filter that I'm trying unsuccessfully below.
 
\SoftChurnersTRY2 = 
VAR _MeasureDate =
    DATEADD(Table1[Profile Created Date].[Date],1,YEAR)

VAR _CountSubs_NonBounceNonOptout = 
    CALCULATE( DISTINCTCOUNT(Table1[Profile Id]), 
        Table1[Engagement] = "passive" || Table1[Engagement] = "active"  || Table1[Engagement] = "engaged"  || Table1[Engagement] = "disengaged" 
    )

VAR _SoftChurnersCount =
    CALCULATE( _CountSubs_NonBounceNonOptout , Table1[Latest Interaction] < _MeasureDate ) 

RETURN
     _SoftChurnersCount
Super User
Super User

Re: Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

@charleshale

 

I haven't checked the rest of the measure but _SoftChurnersCount doesn't return what you expect because _CountSubs_NonBounceNonOptout is a variable.

Variables in DAX are immutable. Once you assign a value to a variable at declaration, its contents won't change. Ever. It's like a constant, a behavior different from what you get in other programming languages.

So your

   

CALCULATE( _CountSubs_NonBounceNonOptout, ...NoMatterWhatComesHere...)

will always return _CountSubs_NonBounceNonOptout. It's like if you did

    

CALCULATE( 7, WhateverFiltersGoHere)

you'll always get 7.

  _CountSubs_NonBounceNonOptout would need to be a measure for the pattern you are trying to apply to work, plus perhaps some minor modification to the code.

 

Cheers

charleshale Regular Visitor
Regular Visitor

Re: Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

I see.   Thank  you.  The following fixes that (note - I'm simplifying the problem down to 1 yr churn for the moment, which wouldnt work for any signs ups in the past year).    Any idea why the following date calc fails to work correctly?  I can provide more detail.

 

Seems like this should be easier to learn -- once upon a time, I ran modeling at an investment bank.....and these churn calcs seem way harder than arrays. 

 

 

\SoftChurnersTRY3 = 
      CALCULATE( 
        COUNTROWS( Table1 ) , 
            FILTER (Table1, Table1[Latest Interaction] < DATEADD(Table1[Profile Created Date].[Date],1,YEAR) ),
            FILTER(Table1, Table1[Engagement] <> "optout" || Table1[Engagement] <> "hardbounce" )
            )

 

 PS - I know it's something to do with the date code.  For QA purposes, I tried created the following measure:
 
\exp_dateadd = CALCULATE( MAXX( Table1, DATEADD(Table1[Profile Created Date].[Date],1,YEAR))
)
It almost worked.  Here was the raw data.
 

 

Profile IdProfile Created DateLatest InteractionEngagement
588574aa99964aa82b8b465e1/22/2017 0:0012/19/2018 0:00engaged
58a8730e72fbf246388b52fe2/18/2017 0:006/16/2017 0:00optout
59a6f24ab84a99f7048b4ca88/30/2017 0:0011/4/2017 0:00optout
59b743fbcbcf4894438b46439/11/2017 0:0011/3/2018 0:00disengaged
59f55c850aea1142238b57d110/29/2017 0:0012/15/2018 0:00active
5a380a5c99964ad07b8b557512/18/2017 0:0012/21/2017 0:00optout
5a935b0f566a94655d8b469d2/25/2018 0:003/26/2018 0:00optout
5b12ff24fbd297519555bf916/2/2018 0:0012/17/2018 0:00active
5b289103954fcf13d35865e46/19/2018 0:0012/19/2018 0:00engaged
5b68d59acff06b4227388bd48/6/2018 0:0012/12/2018 0:00passive
5b7ecfbd2ddf9c19e646eb558/23/2018 0:0012/20/2018 0:00active
5b81573ca7773959a6383bec8/25/2018 0:0012/18/2018 0:00optout
5bb5508db90c2f4faf5e453910/3/2018 0:0011/30/2018 0:00disengaged
5bde70c41708b1620960f41f11/4/2018 0:0012/20/2018 0:00engaged
5be26b7352ba1e69ba563b8c11/6/2018 0:0011/14/2018 0:00optout
5be28d3f954fcf310f5fcdc611/7/2018 0:0012/20/2018 0:00engaged
5c0055270564ce4a2220bc9311/29/2018 0:0012/17/2018 0:00

active

 

 

 
Here was the data with Maxx measure.   I got a year later on Profile Created Date except 2018 Profile Created Dates got stripped out....so there must be something wrong with both the plain DATEADD I tried first and using MAXX.   //Interim update ~5 hrs later - also trying edate - although killer on memory //
 
Profile IdProfile Created DateLatest InteractionEngagement\exp_dateadd
5a380a5c99964ad07b8b557512/18/2017 0:0012/21/2017 0:00optout12/18/2018 0:00
59f55c850aea1142238b57d110/29/2017 0:0012/15/2018 0:00active10/29/2018 0:00
59b743fbcbcf4894438b46439/11/2017 0:0011/3/2018 0:00disengaged9/11/2018 0:00
59a6f24ab84a99f7048b4ca88/30/2017 0:0011/4/2017 0:00optout8/30/2018 0:00
58a8730e72fbf246388b52fe2/18/2017 0:006/16/2017 0:00optout2/18/2018 0:00
588574aa99964aa82b8b465e1/22/2017 0:0012/19/2018 0:00engaged

1/22/2018 0:00

charleshale Regular Visitor
Regular Visitor

Re: Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

So - I solved the problem.   Public service announcement:  DATEADD aint what it seems to be to my amateur eyes.

 

I used EDATE and it was processing intensive and cumbersome but it worked.  Here were the steps:

 

1. Create a measure out of the Profiled Created Date column.  Figuring this out hurt, but here's what worked:

 

\Profile Created Date = 
MAXX(
	KEEPFILTERS(VALUES('Table1'[Profile Id])),
	CALCULATE(MAX('Table1'[Profile Created Date])
))

2. Using edate to make a measure of the churn measurement date (in this case 1 year out): 

 

\_1YrChurnDate = EDATE([\Profile Created Date],12)

3. Writing the Soft Churn logic that started this whole adventure: 

 

\SoftChurnersTRY3 = 
      CALCULATE( 
        COUNTROWS( Table1 ) , 
            FILTER (Table1, Table1[Latest Interaction] < [\_1YrChurnDate] ),
            FILTER(Table1, Table1[Engagement] <> "optout" || Table1[Engagement] <> "hardbounce" )
            )

 

While surely there's an easier way to calculate soft churn in a cohort, and one that has variables, AT LEAST THIS WORKS.   It's a really important issue in email subscriber lists because hard churn (people that bounce and optout) is often ~20% compared to people that never do that but practically never show up again.

 

I'll keep refining the calcs and see where I can go....in the meantime, I'll leave this open because there have to be more elegant solutions than mine.

 

I find it really interesting there's so little on edate vs dateadd and how much better edate is for situations like churn -- because it probably means I'm still thinking of the problem through an excel lense, whereas there would be much easier with a native DAX perspective

Highlighted
Super User
Super User

Re: Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

@charleshale

 

Cool Smiley Happy It's very interesting to see the number of things you've tried and the progress made.

There's quite a lot to be commented on (DATEADD, EDATE, the auto date/time capability...) but let's go step by step.

 

First of all, can you explain the new concept you are using to determine whether a user is a soft churner? Leaving the 'Engagement' value aside for now, you are searching for users that had their last interaction within 1 year of signing up. Why is this? Can you elaborate? 

 

Now moving on to the code you show:

 

1. Why the KEEPFILTERS in:

 

\Profile Created Date = 
MAXX(
	KEEPFILTERS(VALUES('Table1'[Profile Id])),
	CALCULATE(MAX('Table1'[Profile Created Date])
))

 What are you trying to do? I don't think it has any effect in this case but I might be missing something. Have a look at this article.

 

2. Isn't [\Profile Created Date] just an intricate way of getting to the value of 'Table1'[Profile Id]? You would get the same result if you used directly good old 'Table1'[Profile Id] instead of the measure.

 

3. The condition on the second filter of  

 

\SoftChurnersTRY3 = 
      CALCULATE( 
        COUNTROWS( Table1 ) , 
            FILTER (Table1, Table1[Latest Interaction] < [\_1YrChurnDate] ),
            FILTER(Table1, Table1[Engagement] <> "optout" || Table1[Engagement] <> "hardbounce" )
            )

 

should be

Table1[Engagement] <> "optout"  && Table1[Engagement] <> "hardbounce"

Otherwise, with an OR as you had, every row in your sample data will evaluate to TRUE, since none has "hardbounce" in 'Engagement'