Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi everyone,
Since a few months I have been exploring the wonderful word of PowerBI. I am getting the hang of the basics and definitely want to grow in using it. So far I have been managing all of my desired reports, calculations and dax-formula’s but now it’s the first time I am a bit stuck.
I am working at a start-up that does online local grocery shopping including delivery. As you can imagine we want to know how often and recent customers are placing order. Therefore we have started working with the RFM model (recency, frequency and monetary values) and are very happy with that.
So for recency I lookup how many days it has been since their last order and for frequency I check how many orders are placed in the last 3 months. This we want to know for every customer to aggregate how their behaviour is changing and we segment them in groups. E.g. a customer that ordered a lot and recent is a loyal/high value customer.
So far it wasn’t that difficult to calculate the above for the current point in time. I made a customer table with aggregated data.
But now we want to see these segmentation over-time. So basically I want to know the segmentation for every customer for every day since our start in August. Now I have created a new joined table based on date and customer ID’s but that is 2000 customers * 180 days = 360.000 rows already (and growing). Highly inefficient and it’s slowing down my model. I am doing manual calculations per data point as well. I will share them below. I am pretty sure this is not the way to go but I have no clue where to start. Hoping that some of you can help me out with the right line of thought, dax-code or some videos with instructions.
Aantal bestellingen = CALCULATE(COUNTROWS(LOK_Order),
FILTER(LOK_Order,BI_RFM_History[customer_email]=LOK_Order[customer_email]),
FILTER(LOK_Order,LOK_Order[delivery_date]<=BI_RFM_History[Date]))
Aantal dagen sinds laatste bestelling = IF(BI_RFM_History[Date]>TODAY(),BLANK(),DATEDIFF(
CALCULATE(MAXX(LOK_Order,LOK_Order[delivery_date]),
FILTER(LOK_Order,LOK_Order[customer_email]=BI_RFM_History[customer_email]),
FILTER(LOK_Order,LOK_Order[delivery_date]<BI_RFM_History[Date])),BI_RFM_History[Date],DAY))
Date-3M = DATEADD(BI_RFM_History[Date],-3,MONTH)
Aantal bestellingen 3M = IF(BI_RFM_History[Date]>TODAY(),BLANK(),CALCULATE(COUNTROWS(LOK_Order),
FILTER(LOK_Order,BI_RFM_History[customer_email]=LOK_Order[customer_email]),
FILTER(LOK_Order,LOK_Order[delivery_date]<=BI_RFM_History[Date]),
FILTER(LOK_Order,LOK_Order[delivery_date]>=BI_RFM_History[Date-3M])))
Frequency history = IF(BI_RFM_History[Date]>TODAY(),BLANK(),CALCULATE(VALUES(BI_RFM_CONDITIONS[Ranking]),
FILTER(BI_RFM_CONDITIONS,BI_RFM_History[Aantal bestellingen 3M]>=BI_RFM_CONDITIONS[Min_Freq_3M]),
FILTER(BI_RFM_CONDITIONS,BI_RFM_History[Aantal bestellingen 3M]<=BI_RFM_CONDITIONS[Max_Freq_3M])))
Recency_history = IF(BI_RFM_History[Date]>TODAY(),BLANK(),CALCULATE(VALUES(BI_RFM_CONDITIONS[Ranking]),
FILTER(BI_RFM_CONDITIONS,BI_RFM_History[Aantal dagen sinds laatste bestelling]>=BI_RFM_CONDITIONS[Min_recency_days]),
FILTER(BI_RFM_CONDITIONS,BI_RFM_History[Aantal dagen sinds laatste bestelling]<=BI_RFM_CONDITIONS[Max_recency_days])))
Segment_history = IF(BI_RFM_History[Aantal bestellingen]=0,BLANK(),
CALCULATE(VALUES(BI_RFM_Ranking[Benoeming]),
FILTER(BI_RFM_Ranking,BI_RFM_History[Frequency history]=BI_RFM_Ranking[Frequency]),
FILTER(BI_RFM_Ranking,BI_RFM_History[Recency_history]=BI_RFM_Ranking[Recency])))
Obviously not the right way to go… I know I have to user either measures or variables somewhere. Eventuelly this is where I want to go but then in an efficient way:
Hopefully you guys can help out!
This sounds like really nice showcase for data analysis. Congratulations on what you have achieved so far.
Next you will indeed want to dive into measures, and the difference between row context and filter context. For example CALCULATE(MAXX()) may not do what you think it does. Usually you only use one or the other but not both together.
Please provide sample data in usable format (not as a picture) and show the expected numerical outcome.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
115 | |
100 | |
90 | |
68 | |
61 |
User | Count |
---|---|
150 | |
120 | |
100 | |
87 | |
68 |