Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I want to know the retention rate for every group of customers. Customers are grouped based on their first-time order date.
Row header = First Month Order Date
Column header = Order Date
for example:
On Jan 2020, there are 5 customers who place an order for the first time (lets say, customer A B C D E). Then, on Feb 2020 the dax have to check whether this 5 customers place an order again or not. If true, then count the customer ID, which is 2 customers left (lets say, customer A and B). and so on until it hit the current date or until zero. I want this dax to only count when there is a 1 month difference order date (example: if they first order on january 2020, then order again on march 2020, it won't count as retention and considered as churn).
The problem is, my attempt still count customer ID after it hits zero (as you can see there is 1 customer in May 2020 column, it should be zero all the way to June 2020)
This is my attempt on dax:
==================================================
repeat_order =
VAR baseFilter =
FILTER
(
'orders',
'orders'[customer_id] = EARLIER('orders'[customer_id])
)
VAR selectDate =
CALCULATE
(
LASTDATE('orders'[created_at]),
baseFilter,
FILTER
(
baseFilter,
'orders'[created_at] < EARLIER('orders'[created_at])
)
)
VAR repeat =
MONTH('orders'[created_at])
-
CALCULATE
(
MIN('orders'[created_at].[MonthNo]),
baseFilter,
FILTER
(
baseFilter,
'orders'[created_at] = selectDate
)
)
RETURN
IF
(
'orders'[status] = -1
,
BLANK()
,
IF
(
'orders'[created_at] = CALCULATE(MIN('orders'[created_at]), FILTER(ALLEXCEPT('orders', 'orders'[customer_id]), ' orders'[status] <> -1))
,
"First Order"
,
IF
(
repeat = 1,
"Yes",
"No"
)
)
)
==============================================================
Cohort Repeat Order =
IF
(
MIN('customers'[Join Date].[MonthNo]) = MIN('orders'[created_at].[MonthNo])
,
CALCULATE
(
DISTINCTCOUNT('orders'[customer_id]),
'orders'[status] <> -1
)
,
CALCULATE
(
DISTINCTCOUNT('orders'[customer_id]),
'orders'[status] <> -1,
'orders'[repeat_order] = "Yes"
)
)
These type of analyses are tricky, and usually require making a second table to hold duplicate values (in your case for Months) to be used as the column for the matrix columns. A variation on the Basket Analysis pattern by SQLBI can also be used.
I made some sample data, and a second table with month from that to use as columns and came up with this measure that appears to do what you are looking for.
Remaining Customers =
VAR __mindate =
MIN ( Customers[Date] )
VAR __returnmonthmin =
MIN ( Dates2[Date] )
VAR __initcustomers =
FILTER (
VALUES ( Customers[Customer] ),
ISBLANK (
CALCULATE (
COUNTROWS ( Customers ),
ALLEXCEPT ( Customers, Customers[Customer] ),
Customers[Date] < __mindate
)
)
)
VAR __returncustomers =
CALCULATETABLE (
VALUES ( Customers[Customer] ),
ALL ( Customers ),
TREATAS ( VALUES ( Dates2[Month Name] ), Customers[Month Name] )
)
RETURN
COUNTROWS ( INTERSECT ( __initcustomers, __returncustomers ) )
If this works for you, please mark it as the solution. Kudos are appreciated too. Please let me know if not.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Hi @mahoneypat
i tried to use your solution but it gives me this
more information: I have fact table orders and dimension customers connected. orders[created_at] is order date and customers[join_date] is first-time order date
I forgot to mention that there should be no relationship between the two tables. Is that the case in your model?
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Hi @mahoneypat
Yes, there is a relationship between fact and dimension, I need it for another visualization. Do you have any suggestions?
Sorry. I mean there should be no relationship to the new table I suggested you make with just the Month values.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Hi @mahoneypat
I deactivated the relation between the new date table and customer table, but the result still the same as the last pic I sent.
I suspect the __returncustomers variable is returning blank. Can you return the COUNTROWS of that to confirm? Also, are you able to send a link to your pbix so I can troubleshoot it directly? If not, can you send a pic of your diagram view (tables and relationships)?
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Hi @mahoneypat
Yes, the countrows returns blank.
I can give you the data
https://drive.google.com/drive/folders/1j6qcHhBQdEUWsTXjKOWFHc4B2ohIPO-b?usp=sharing
(The data contains orders from 2019-2020, but i only need monthly retention for each year, either 2019 or 2020)
and the diagram looks like this
Ok. I loaded your csv files and did the following:
1. Made a Date tables with Date1 = DISTINCT(Orders[created_at]) and added Month and Month Name columns (month number to sort Month Name)
2. Made a 2nd Date table with Date2 = Date1
3. Related the tables as shown here
4. Used this updated measure
Remaining Customers =
VAR __mindate =
MIN ( Date1[created_at] )
VAR __initcustomers =
FILTER (
VALUES ( Orders[customer_id] ),
ISBLANK (
CALCULATE ( COUNTROWS ( Orders ), ALL ( Date1 ), Date1[created_at] < __mindate )
)
)
VAR __returncustomers =
CALCULATETABLE (
VALUES ( Orders[customer_id] ),
ALL ( Date1 ),
TREATAS ( VALUES ( Date2[created_at] ), Date1[created_at] )
)
RETURN
COUNTROWS ( INTERSECT ( __initcustomers, __returncustomers ) )
5. To get this result
If this works for you, please mark it as the solution. Kudos are appreciated too. Please let me know if not.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Hi @mahoneypat
I did all the steps you have given me, also triple checked the steps. But it still doesn't work the same as your pbix.
For May, it should be: 0 4 3 1 1 1 0 0 0
(nb: for every month, it should take list of customer id from previous month, then intersect it with that particular month)
Do you think i missed some steps or something?
Here is a link to my pbix, so you can see what is different. Please let me know what you find.
https://drive.google.com/file/d/1a0wYJZp5lcPAs_VufGVMgPhx5i0kW7kd/view?usp=sharing
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Hi @mahoneypat
I just realized something. The data in your pbix only include Apr-Oct 2019 data, that's why it looks different. And it still doesn't calculate the way I want it.
I know that the logic should be like this:
Example: 5 month order date
A B C D E
A intersect B, then
(A⋂B)⋂C, then
(A⋂B⋂C)⋂D, then for example, this line calculate to zero, E = automatically zero.
I have trouble converting this logic to dax. Sorry for the vague explanation, I hope you can understand me
I used the csv files you provided, so I couldn't check if it worked for customers that missed a month. In any cases, here is another expression that I believe does what you are looking for. If not, please send a link to the full orders.csv (or show the correct output expected from the file you did send).
This expression filters the returncustomers to only those that made purchases in all of the months in between the Date1 and Date2 month values. It is a tricky expression (and I really hope it works now).
Remaining Customers2 =
VAR __mindate =
MIN ( Date1[created_at] )
VAR __maxdate2 =
MAX ( Date2[created_at] )
VAR __monthvalues =
CALCULATETABLE (
VALUES ( Date1[Month] ),
ALL ( Date1 ),
Date1[created_at] >= __mindate
&& Date1[created_at] <= __maxdate2
)
VAR __monthcount =
COUNTROWS ( __monthvalues )
VAR __initcustomers =
FILTER (
VALUES ( Orders[customer_id] ),
ISBLANK (
CALCULATE ( COUNTROWS ( Orders ), ALL ( Date1 ), Date1[created_at] < __mindate )
)
)
VAR __returncustomers =
CALCULATETABLE (
VALUES ( Orders[customer_id] ),
ALL ( Date1 ),
TREATAS ( VALUES ( Date2[created_at] ), Date1[created_at] )
)
VAR __returncustomersallmonths =
FILTER (
__returncustomers,
CALCULATE (
COUNTROWS (
FILTER ( __monthvalues, NOT ( ISBLANK ( CALCULATE ( COUNTROWS ( Orders ) ) ) ) )
),
ALL ( Date1 )
) = __monthcount
)
RETURN
COUNTROWS (
INTERSECT ( __initcustomers, __returncustomersallmonths )
)
If this works for you, please mark it as the solution. Kudos are appreciated too. Please let me know if not.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Hi @mahoneypat
Sorry, I just realized i didn't give you the full data. Also, i tried the newest formula but still it doesn't work like intended. So i decided to tweak your formula and i did it!
tldr; i made new column to identify returning customers and customers who missed a month. Then the formula counts whoever returned + doesn't miss.
Thanks for your help! Sorry, i can't accept your formula as the solution (to avoid people getting the wrong idea), but kudos given!
Thank you for all the kudos. Glad you got it figured out.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
49 | |
26 | |
20 | |
15 | |
12 |
User | Count |
---|---|
57 | |
49 | |
43 | |
19 | |
18 |