Calculating Recency with Auto-Updating Sales Data

_Ester_ · ‎05-19-2024

Hi everyone!

I need some assistance with a Power BI project related to calculating "Recency" from our sales data. Recency is a metric that measures the number of days since a customer's last purchase.

🔲 Here's the context:

I have a dataset named "Sales" which contains customer transaction records. Each row represents a single transaction with the following columns:

date: The date when the purchase occurred
customer_id: A unique identifier for the customer
total_sales: The amount spent by the customer in that transaction

🔲 Important Notes:

The date can repeat if multiple transactions occurred on the same day.
The customer_id can repeat if the same customer made multiple purchases over time.
The Sales database updates monthly.

🔲 Objective:
I want to calculate the "Recency" for each customer for each month of the year (or years) automatically, ensuring the calculation adapts as the Sales database updates. This should be a running recency, which is the number of days since the last purchase for each customer.

🔲 Specific Requirements:

The recency calculation should include all customers up to the month being analyzed, regardless of whether they made a purchase in that specific month.
I need to visualize this Recency data in a chart. For example, the chart could show the percentage or count of customers with a specific recency: how many customers have a recency of X days, how many have a recency of Y days, etc.

🔲 Example: If looking at the data for February 2024, I want to know the recency for all customers up to that point, including those who did not make any purchases in February 2024, either as a count or as a percentage of the total customers up to that month.

👉 You can find a PBIX and a Sales data sample here.

Thank you for your help!

_Ester_ · ‎05-20-2024

Hi @lbendlin ,

The complete project is a bit more complex, and I tried to simplify it here.
What I need is to perform an RFM analysis to segment customers based on three parameters:

Recency (days since the last purchase),
Frequency (purchase frequency),
Monetary (spending).

Then, I segment customers based on the combinations of recency, frequency, and monetary. So far, I have no issues with this part.

Each RFM segment is then targeted with specific email marketing strategies, for example, aiming to increase a customer's spending, to encourage more frequent purchases, or to decrease their recency (or a combination of these goals).

👉What insights/actions should come out of it?
I need to understand if my email marketing strategies are working. To do this, I want to conduct a customer migration analysis to understand how many (or what percentage) of customers move from one segment to another.
For example, if the "TOP" customer category has not increased, it might indicate that my email marketing strategy is not working and needs to be adjusted.

Here are the specific details of the categories for a concrete example (I've no issues with this part):

Each customer is given a label based on their Recency, Frequency, and Monetary values.

Recency categories:

Months passed since the previous transaction (up to the current date) ≤ 1 month ago: "HOT"
Months passed since the previous transaction (up to the current date) from 1 to 6 months ago: "RECENT"
Months passed since the previous transaction (up to the current date) from 6 to 12 months ago: "INACTIVE"
Months passed since the previous transaction (up to the current date) > 12 months ago: "HIBERNATING"

Frequency categories:

Number of purchases up to the current date ≤ 1: "NEW"
Number of purchases up to the current date between 2 and 3: "OCCASIONAL"
Number of purchases up to the current date between 4 and 7: "FREQUENT"
Number of purchases up to the current date > 7: "LOYAL"

Monetary categories:

Total sales up to the current date ≤ 50 €, then: "COPPER"
Total sales up to the current date between 50 and 100 €, then: "BRONZE"
Total sales up to the current date between 100 and 200 €, then: "SILVER"
Total sales up to the current date > 200 €, then: "GOLD"

Then, there are RFM categories as combinations of recency, frequency, and monetary:

NEW:

Recency: < 6 months
AND Frequency: = 1 purchase
(Any Monetary)

SLEEPERS:

Recency: between 6 and 12 months ago
Any Frequency
Any Monetary

HABITUAL:

Recency: < 6 months
AND Frequency: > 2 purchases
AND Monetary: between 50 and 200 euros

TOP:

Recency: < 6 months
AND Frequency: > 2 purchases
AND Monetary: > 200 euros

So far, so good: I have no issues segmenting customers based on their current status; the real challenge arises when I want to conduct a customer migration analysis to find out HOW MANY customers move from one segment to another over time.

Thank you again for your help!

lbendlin · ‎05-20-2024

Thank you for the details, much appreciated. So i guess you have to accept the slight inconsistencies that come with the monthly bucketing. For Segment moves I would propose to use a ribbon chart, or a Sankey chart (that one requires additional data prep). To calculate the data you will have to materialize the RFM parameters for each customer and each month (either in a calculated table or through a measure), and then you can plot them accordingly. You will need to disconnect your Calendar table.

I'll have a look at your sample PBIX.

lbendlin · ‎05-20-2024

Here's how a sample approach for recency for a single customer.

_Ester_ · ‎05-23-2024

Hi @lbendlin,
I tried to replicate your approach below:

@lbendlin wrote:
Here's how a sample approach for recency for a single customer.

I created the measures to calculate Recency as you showed me and I also added the measures for Frequency and Monetary:

It works! However, I can't understand how from here I can visualize the customer migration, for example with a ribbon chart.

I suppose I should somehow assign each customer_id a category based on their current condition, and then count how many customer_ids are in that category, but I can't find a way to do this with the calculated measures.
I tried to create a calculated measure for the RFM segments, but it doesn't work.

Could you guide me?
I updated the PBIX file with these new measures.

Again, thank you!

lbendlin

If you still want to avoid using a calculated table the next step is to prepare your independent dimensions (dates and customers) and then calculate the measures for each of the periods.

Without that the charts may not give you much

_Ester_

Hi @lbendlin,
I still can't understand your solution 😞

In the meantime, I developed the idea you suggested, adapting it as much as possible to what I need.

What's changed:

🔲 I have added measures:
Recency2, Frequency2, Monetary2, and RFM segments2 which calculate the status of each customer in EVERY period of the year, including periods where no transactions were made.

🔲 I have added a RFM categories table with the 5 possible categories ("NEW", "FROZEN", etc.) and the "Unique customers per segment" measure which calculates the number of users in each RFM category for each period.

✅ What I achieved:
I can know the size of each RFM segment over time, which can be an indication of how my business is performing.

Look at the visual below:
each column represents 100% of customers up to that date.

❌ What I didn't achieve:
A segment may have grown not because of my marketing strategies, but simply because the total number of customers has grown.
Additionally, I still don't know which segment the customers came from.

For example:
If the HABITUAL segment has grown, it is positive if the additional HABITUAL customers were previously TOP AT RISK customers, it is not positive if they were previously TOP customers.

@lbendlin do you think it is possible to start from what I have done and somehow understand the segment each customer belonged to in the previous period?

I have updated the PBIX with the new measures and visualizations (page: "RFM_Analysis_2" and "Migration")

Thanks again!

_Ester_ · ‎05-20-2024

Hi @lbendlin ,

Thank you for your suggestions! I tried both the Ribbon chart and the Sankey chart while attempting to avoid creating a calculated table with every combination month-customer_id because I believe that in the long run, it could significantly slow down the system.
Since the Sales file updates over time, the combinations of month and customer_id could become extremely large.

Is there a way to achieve the same goal without slowing down the system and ensuring everything is automatic?

Is there a way to create the calendar with all month-customer_id combinations directly with a calculated measure and then use these data to create the visual?
Is there a way to ensure that the table created with the measure and the visual update automatically over time?

Thank you again for your help!

lbendlin · ‎05-22-2024

Is there a way to create the calendar with all month-customer_id combinations directly with a calculated measure and then use these data to create the visual?

Measures return scalar values. They cannot hold tables.

Is there a way to ensure that the table created with the measure and the visual update automatically over time?

Not sure what you are asking. Assuming your data source is in Import mode this will be taken care of during the semantic model refresh.

_Ester_ · ‎05-23-2024

@lbendlin sorry, I think I just now understood the solution you proposed.
Let me take a look to make sure I got it right, and I'll get back to you.

Thanks in the meantime!

lbendlin · ‎05-19-2024

I want to calculate the "Recency" for each customer for each month of the year (or years)

This is generally considered a fallacy, as you have not specified where in the current month you have a transaction. A better approach will be to say "number of days between first transaction in current period and the penultimate transaction".

_Ester_ · ‎05-20-2024

Hi @lbendlin ,

Thank you for your input. I understand your point, but I believe there might be a misunderstanding.

In this context, "Recency" refers to the number of days since the last purchase made by each customer, not the number of days between the first and penultimate transactions. This metric is commonly used in customer behavior analysis to determine how recently a customer has engaged with the business.

▶️My goal is to calculate this "Recency" for each customer for each month, considering all their transactions up to that month.
⚫This means if a customer made their last purchase on January 15, 2024, the recency for this customer when looking at the data at the end of January 2024 will be 16 days; if looking at the data at the end of February 2024, the recency will be 46 days.

Could you please help me with a way to calculate this in Power BI?

Thank you!

lbendlin · ‎05-20-2024

I didn't say "first". I said "first in selected period".

The period aggregation is the problem. Instead, treat each customer individually.

_Ester_ · ‎05-20-2024

Hi @lbendlin ,

I'm not sure I understand what you are saying.
To clarify, my objective is to calculate the "Recency" for each customer individually, considering all their transactions up to the end of the selected month. For each customer, I want to determine the number of days since their last purchase as of the end of each month.

My ultimate goal is to understand if my marketing strategies are working.
For instance, if 80% of my customers have a recency of 90 days this month (meaning 80% last purchased 3 months ago) and I implement a marketing strategy to reduce recency, I want to see if the percentage of customers with a recency of 90 days has decreased after 2 months.
This would help me determine if the strategy was effective.
Therefore, I need to compare the previous period with the current period to see if there has been an improvement. Ideally, I would like to understand if there has been an improvement over multiple months.

Could you please explain more clearly what you mean by "period aggregation" and how it relates to my objective?

Thank you for your help!

lbendlin · ‎05-20-2024

Let's assume you are looking at May data.

Customer A had their first transaction on May 1st

Customer B had their first transaction on May 20th

Both customers had their penultimate transaction on April 30th.

What's the recency for either customer?

_Ester_ · ‎05-20-2024

Hi @lbendlin ,

Thank you for the example. I calculate recency as the time from the last transaction to the last date of analysis (in this case, May 31).

So, for Customer A:

If the transaction on May 1st is the last one in May, the recency is 31−1=30 days.

For Customer B:

Similarly, if the transaction on May 20th is the last one in May, the recency is 31−20=11 days.

From what I understand, you are suggesting calculating the recency as the number of days between the penultimate purchase and the first purchase of the selected period. However, this would not be a correct calculation of recency and could lead to misleading results.

Example:

Customer C:
First purchase of the period: May 15th
Penultimate purchase: January 1st

Using your suggested method, the recency would be 135 days, whereas calculating it as I suggest, it would be 16 days.

With the first calculation method, I would think that this is a dormant customer who needs to be reactivated, while with the second calculation method, this is a customer I consider active, which they actually are.

Could you please confirm if my interpretation of your suggestion is correct?

Thank you for your help!

lbendlin · ‎05-20-2024

How is it a dormant customer when they had transactions in the "current" period?

What are you ultimately trying to achieve with the report? What insights/actions should come out of it?

Calculating Recency with Auto-Updating Sales Data

Helpful resources

New forum boards available in Real-Time Intelligence.

Power BI Monthly Update - May 2024

How to Get Your Question Answered Quickly

Jumpstart your career with the Fabric Career Hub

Calculating Recency with Auto-Updating Sales Data

Helpful resources

New forum boards available in Real-Time Intelligence.

Power BI Monthly Update - May 2024

How to Get Your Question Answered Quickly