Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
charleshale
Responsive Resident
Responsive Resident

Advanced Email Subscriber Churn Rate - Taking into Account Inactivity and Cohorts

Hi.  While I've mastered most of the basics in DAX and M Query for churn, there's a scenario that I find quite interesting that I wanted to see if people had feedback for.

 

Consider a churn analysis for a typical large (30m monthly unique visitor) publisher's email subscriber list - snapshot below:

 

image.png

 

Here's what these columns represent and why they should be all that's needed to determine churn:

 

1. Profile ID represents a unique subscriber (in this case it's just a hashed email address).  For DAX churn calcs, we'll basically be counting these under various contexts.

2. Profile Created Date in this instance determines cohort since this is the date someone joins.   What I'm going to suggest below is a churn metric that uses a median of the churn rate across certain cohorts.

3. Engagement status.   The most basic level of churn happens when someone goes (i) "hardbounce" or (ii) "optout".  I call this "Hard" Churn.   The Latest Interaction date captures the date of this event.

4.  (and here's the hardest part) Latest Interaction.   Most subscribers in these email lists never hardbounce or optout and yet need to be counted as what I call "Soft" Churn because they disengage to the point where don't visit and need to be counted as churn.    

 

Accordingly, does anyone have any suggestions on how I'd write a DAX query to get the following: 

 

A.  Median 28 Day "Hard" Churn for the Prior 12 Months:  taking into memory for each of the last 12 fully completed monthly cohorts (ie since it's now Jan 2019, we would look at the cohorts of users who signed up during 2018-01, 2018-02 .... 2018-12), calculating the percent of Profile ID's that "hard churn" -- defined as "hardbounce" and "optout" -- in each cohort, and from that creating a measure of the median of  those 12 churn percentages.

 

(note I've done this -- harder part follows below)

 

B.  Setting up the 28 day period as a variable or parameter so that we can look at churn over any number of days while automatically counting only the last 12 periods that are fully completed given that number of days.  In other words, let's say we want to change 28 Day Hard Churn to 90 Day Hard Churn, the 12 cohorts analyzed need to be 90 days earlier (2017-09, 2017-10 .... 2018-09) because a cohort will only be measurable for churn purposes if its last day is older than the # of days history being looked at.  


(and here's the hardest part)

C. Adding "Soft" Churn" to #1 and #2, above.     "Soft" Churn is someone who should be counted as having churned because they havent visited the site in over X days (let's use 120 days for now - although ideally this is Parameter).  This is the part that has made my brain hurt the most because the DAX needs (I think) to draw a variable based the difference between a fixed date that the numbers were drawn (ie 1/20/2019) and Latest Interaction to see if the Profile ID should be deemed churned due to inactivity.  I have done this manually through tremendous amounts of calculated columns.  Do any DAX geniuses know how to do this with measures from just the 4 primary columns, above?

 

Note: since anyone interested in this is probably interested in what we see as churn, where's what the numbers look like for us (based on my more manual and more M-Query calcs):   see chart below: we typically see hard churn ~16% in the first month of any cohort, then an additional 6% in month 2, then about another 20% over the rest of the year for ~40% in the first year if we count 6 month inactives as having churned.   6 month inactives come back only about 5% of the time.

 

Annotation 2019-01-28 151142.jpg\

If I were super-skilled DAX, ideally I'd be able to show a stacked area chart for the median of any given set of cohorts where the Y axis is % churn, the X axis is time, and the stacked areas are "hard churn" and "soft churn" (judged by X day inactivity)

3 ACCEPTED SOLUTIONS

@charleshale

 

Hey Charlie, how about the approach below? I haven't tested it because your sample data has only one date for all entries but see if it helps, at least the logic. 

 

What I've done is add an additional step to filter out the users earlier classified as soft churners that at some point had "optout" or "hardbounce" in Table1[Engagement] . That's what I've understood you need. Correct me if I'm mistaken. We go over the list and check for each ID whether its values in Table1[Engagement] (returned by the CALCTABLE) include  "optout" or "hardbounce" (the intersect would be non-blank).

 

SoftChurnersCountExcludingHard =
VAR _RefDate =
    DATE ( 2019, 01, 20 ) //This can be set via a slicer as well
VAR _PeriodOfInactivity = 120 //Number of days. Change here or through a slicer
VAR _CutOffDate = _RefDate - 120
VAR _SoftChurnersList =
    FILTER (
        DISTINCT ( Table1[Profile ID] ),
        CALCULATE ( MAX ( Table1[Latest Interaction] ) ) < _CutOffDate
    )
VAR _SoftChurnersExcludingHard =
    FILTER (
        _SoftChurnersList,
        COUNTROWS (
            INTERSECT (
                CALCULATETABLE ( DISTINCT ( Table1[Engagement] ) ),
                { "optout", "hardbounce" }
            )
        )
            > 0
    )

RETURN COUNTROWS ( _SoftChurnersExcludingHard )

I've just taken the previous code as base and extended it so I am a bit worried about the performance of this approach. This might not be the most efficient but well, if your dataset is not very large it should be fine. Otherwise we could explore further.

 

 

         

View solution in original post

@charleshale

 

I haven't checked the rest of the measure but _SoftChurnersCount doesn't return what you expect because _CountSubs_NonBounceNonOptout is a variable.

Variables in DAX are immutable. Once you assign a value to a variable at declaration, its contents won't change. Ever. It's like a constant, a behavior different from what you get in other programming languages.

So your

   

CALCULATE( _CountSubs_NonBounceNonOptout, ...NoMatterWhatComesHere...)

will always return _CountSubs_NonBounceNonOptout. It's like if you did

    

CALCULATE( 7, WhateverFiltersGoHere)

you'll always get 7.

  _CountSubs_NonBounceNonOptout would need to be a measure for the pattern you are trying to apply to work, plus perhaps some minor modification to the code.

 

Cheers

View solution in original post

@charleshale

 

So taking into account the comments on the previous posting, I believe your measure could be simplified a bit:

 

\SoftChurnersTRY3_V2 = 
      CALCULATE( 
        COUNTROWS( Table1 ) , 
            FILTER (Table1, Table1[Latest Interaction] < EDATE(Table1[Profile Created Date],12) ),
            FILTER(Table1, Table1[Engagement] <> "optout" && Table1[Engagement] <> "hardbounce" )
            )

 

where we make use of the beautiful row context available within FILTER( ) Smiley Happy Note we're not using the two additional measures you created.

 

Or we could write the condition in the second FILTER( ) by using the IN operator (no conceptual change, just a different syntax to state the very same):

 

\SoftChurnersTRY3_V3 = 
      CALCULATE( 
        COUNTROWS( Table1 ) , 
            FILTER (Table1, Table1[Latest Interaction] < EDATE(Table1[Profile Created Date],12) ),
            FILTER(Table1, NOT Table1[Engagement] IN { "optout" , "hardbounce"} )
            )

 

There are still some additional (small) changes that could be done. And maybe some modification to increase performance, although that will depend to a substantial extent on the nature of your data.

 

By the way, I'm curious. Why do you start the names of your measures with a "\"? Is it to group them alphabetically so that they show all together? 

 

Cheers

View solution in original post

11 REPLIES 11

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.