Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello Folks,
I am new to PowerBI & Dax, please bear with me for any silly question.
I am trying to calculate Distinct active customers for each month.
I have customer table with Customer key, join date key amd term date key. I have date dimension table too.
Customer Table
Customer key join date key Term date
1. 1/1/2017 5/31/2017
2. 3/1/2017 3/31/2017
3. 3/1/2017 5/31/2017
my monthly active customer count should look like
Month Count
Jan 1
Feb 1
March. 3. since all customers are active
April. 2. since 1 and 3 are active
May. 2 since 1 and 3 are active
Any help or guidance is greatly appreciated.
Update: Customer table has around 65 million records, above is just the sample.
Thank you.
Solved! Go to Solution.
@sqlguru448 Hi
To get to the solution I did 2 things
Here is how the relationships are setup between - Customers Table and Cal Table
Both Join Date and Term Date have been related to Date in the Cal Table. Since 2 date columns are related, Customer[Join Date] with Cal[Date] is currently active
Here are the 3 Measures
1. Cumulative Distinct Count Until this Month = TOTALYTD(DISTINCTCOUNT([Customer Key]),Cal[Date]) It will give you the distinct cumulative count of customers until the current month
2. Terminations Until Previous Month = TOTALYTD(CALCULATE(DISTINCTCOUNT(Customer[Customer Key]),USERELATIONSHIP(Cal[Date],Customer[Term Date]),DATEADD(Cal[Date],-1,MONTH)),Cal[Date]) It will give the cumulative count of people who left until last month
3. Customers Active =[Cumulative Distinct Count Until this Month]-[Terminations Until Previous Month] the difference of both these will give you the number of customers left over for the current month
Here is the snapshot of the result
You can download the Excel PowerPivot File
Hi,
You may refer to my solution here.
Hope this helps.
Thank you @danextian @Ashish_Mathur and @ChandeepChhabra for your inputs which helped to arrive at my solution, I was able to do it by creating calculated Month table by using below dax and performance was also fine.
Month:=FILTER(
GROUPBY('d_date',d_date[Year_Month],d_date[Month_Abbr],d_date[Month_Number],"YM",MAXX(CURRENTGROUP(),VALUE(right('d_Date'[Year_Month],2))),"Start",Minx(CURRENTGROUP(),'d_date'[Calendar_Date]),"End_Date",MAXX(CURRENTGROUP(),'d_date'[Calendar_Date])
)
,AND('d_date'[Year_Month] >=201601 ,'d_date'[Year_Month]<=201712)
) Order By 'd_date'[Year_Month]
and then I created calculated column to count active member by month by writing below dax
= CALCULATE(
DISTINCTCOUNT('customer'[Customer_Key]),
FILTER('Customer',
AND(Customer[JoinDate]<='Dates'[End_Date] ,
Customer[TermDate] >='Dates'[Start] )
)
)
Thank you for your help
Hi,
You may refer to my solution here.
Hope this helps.
Hi Ashish.
May I know how did you manage to list "Customer Key" and "Date" (the range between the start date and the end date) under data table. New to power bi. Apologies if that has to be self-understood. Thanks!
Hi,
I managed to create a range of dates from just the beginning and end date inputs using the M language of the Query Editor. Please review the steps in the Query Editor to understand better.
Thank you for your response, Sorry if I haven't mentioned earlier. There are around 65 million records in Customer table and Cross join of Customer and Date will have performance issue.
Proud to be a Super User!
Thank you @danextian @Ashish_Mathur and @ChandeepChhabra for your inputs which helped to arrive at my solution, I was able to do it by creating calculated Month table by using below dax and performance was also fine.
Month:=FILTER(
GROUPBY('d_date',d_date[Year_Month],d_date[Month_Abbr],d_date[Month_Number],"YM",MAXX(CURRENTGROUP(),VALUE(right('d_Date'[Year_Month],2))),"Start",Minx(CURRENTGROUP(),'d_date'[Calendar_Date]),"End_Date",MAXX(CURRENTGROUP(),'d_date'[Calendar_Date])
)
,AND('d_date'[Year_Month] >=201601 ,'d_date'[Year_Month]<=201712)
) Order By 'd_date'[Year_Month]
and then I created calculated column to count active member by month by writing below dax
= CALCULATE(
DISTINCTCOUNT('customer'[Customer_Key]),
FILTER('Customer',
AND(Customer[JoinDate]<='Dates'[End_Date] ,
Customer[TermDate] >='Dates'[Start] )
)
)
Thank you for your help
Hi Ashsih,
Quick question, why would you make a separate month order table? How does it help ?
Hi,
To sort the Months in the visual by the order of the Calendar months.
I've been focusing more on M than on DAX in the recent months as my most of my data are very unstructured.
It's nice to be learning different approaches to the same problem.
Proud to be a Super User!
@sqlguru448 Hi
To get to the solution I did 2 things
Here is how the relationships are setup between - Customers Table and Cal Table
Both Join Date and Term Date have been related to Date in the Cal Table. Since 2 date columns are related, Customer[Join Date] with Cal[Date] is currently active
Here are the 3 Measures
1. Cumulative Distinct Count Until this Month = TOTALYTD(DISTINCTCOUNT([Customer Key]),Cal[Date]) It will give you the distinct cumulative count of customers until the current month
2. Terminations Until Previous Month = TOTALYTD(CALCULATE(DISTINCTCOUNT(Customer[Customer Key]),USERELATIONSHIP(Cal[Date],Customer[Term Date]),DATEADD(Cal[Date],-1,MONTH)),Cal[Date]) It will give the cumulative count of people who left until last month
3. Customers Active =[Cumulative Distinct Count Until this Month]-[Terminations Until Previous Month] the difference of both these will give you the number of customers left over for the current month
Here is the snapshot of the result
You can download the Excel PowerPivot File
I can think of doing it in both M and DAX but I also would like to know how it can be done in pure DAX.
Question: Do the join and key term dates always begin at start of month and end at month end?
Proud to be a Super User!
And so I've ended up doing it both in M and in DAX.
M:
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTLUN9Q3MjA0BzJN9Y2h7FidaCUjoIgxQtIYRdIYVRJJZywA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"customer key" = _t, #"join date" = _t, #"key term date" = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"customer key", Int64.Type}, {"join date", type date}, {"key term date", type date}}), /*generated a list of dates from join to key term dates, not sure if there is a function to generate a list of months so I just divided the day difference between the two dates by 28, I also set the interval to 28 days -#duration (28, 0, 0, 0), then converted each date in the list to end of month, which can cause duplicates so I used List.Distinct to remove possible duplicates */ #"Added Custom" = Table.AddColumn(#"Changed Type", "List Of Dates", each List.Distinct(List.Transform(List.Dates( [join date], (Number.From([key term date] - [join date] ) + 1)/28, #duration(28, 0, 0, 0)), Date.EndOfMonth))), #"Expanded List Of Dates" = Table.ExpandListColumn(#"Added Custom", "List Of Dates"), #"Changed Type1" = Table.TransformColumnTypes(#"Expanded List Of Dates",{{"List Of Dates", type date}}), //removed dates that are beyond key term date #"Filtered Rows" = Table.SelectRows(#"Changed Type1", each ([List Of Dates] <= [key term date])) in #"Filtered Rows"
DAX
DistinctCustomerCountTable = VAR CalendarTable = //create a calendar table CALENDAR ( MIN ( 'Table'[join date ] ), MAX ( 'Table'[key term date] ) ) VAR NotSummarized = //filter the table to Date Check = TRUE FILTER ( ADDCOLUMNS ( //crossjoin original table with CalendarTable CROSSJOIN ( 'Table', CalendarTable ), //add end of month of Date column from CalendarTable "End of Month", EOMONTH ( [Date], 0 ), //add a column to check whether the Date column form CalendarTable is within the range of join and key term dates "Date Check", [Date] >= 'Table'[join date ] && [Date] <= 'Table'[key term date] ), [Date Check] = TRUE () ) RETURN //ommit extra dates and return just end of mohth and the columns from the original table SUMMARIZE ( NotSummarized, 'Table'[customer key], 'Table'[join date ], 'Table'[key term date], [End of Month] )
The two formulas follow a very similar approach. That is to generate a list of dates within the join and key term dates date range. Both initially generate a bunch of dates so I added a logic to get just one date per month.
Anyone out there who has a more elegant solution? I'd love to see yours. 🙂
--update--
changed || to &&
Proud to be a Super User!
Thank you for your response, Sorry I haven't mentioned earlier.
I have around 65 million records in the customer table and Cross join is not feasible for me to do as it will affect the performance.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
114 | |
99 | |
83 | |
70 | |
60 |
User | Count |
---|---|
150 | |
115 | |
104 | |
89 | |
65 |