Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a $100 discount. Register Now

Reply
Buzzshot43
Frequent Visitor

KM Survival Curve in Power BI

Hi,

 

I was wondering if it's possible to recreate this guide and functionality in PowerBI:

 

https://www.linkedin.com/pulse/survival-curves-tableau-hr-data-chris-short/

 

Thank you. 

1 ACCEPTED SOLUTION
Greg_Deckler
Super User
Super User

@Buzzshot43 - I have the full solution published to a blog article here:

https://www.linkedin.com/pulse/kaplan-meier-survival-curves-power-bi-greg-deckler-microsoft-mvp-

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

5 REPLIES 5
Greg_Deckler
Super User
Super User

@Buzzshot43 - I have the full solution published to a blog article here:

https://www.linkedin.com/pulse/kaplan-meier-survival-curves-power-bi-greg-deckler-microsoft-mvp-

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...
Greg_Deckler
Super User
Super User

Yes, that seems fairly basic. Will see if I can mock up some data and recreate that in Power BI DAX but definitely possible.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

Thank you very much @Greg_Deckler. It will be greatly appreciated. 

@Buzzshot43, I posted this in another thread but figured I would post it here as well. The other thread is here:

https://community.powerbi.com/t5/Desktop/Kaplan-Meier-Survival-Curves/m-p/330094/highlight/false#M14...

 

This solution is close and I think is a viable way of solving this but I'm not getting exactly the same results so this may need some tweaking. Original article here: https://www.linkedin.com/pulse/survival-curves-tableau-hr-data-chris-short/

 

My ego would have me believe that the original article is in error, but that is probably not true and I have something off in my calculations, if someone could help me find where things are going awry that would be great!! 🙂

 

OK, so I start with the HRData table from the article mentioned above. I'd post it here but that would be too much data. Here are the first 10 rows:

IDNameHire DateTermination DateStatusDepartment
0 1/1/20101/1/2010InactiveHR
0 1/1/20101/1/2010InactiveSales
1Inez Laurie  4/23/20109/2/2015InactiveHR
2Yasuko Scruton  1/20/2010 ActiveHR
3Kristopher Walkes  4/18/2010 ActiveHR
4Geraldine Mclennon  6/12/2010 ActiveHR
5Pilar Willard  1/14/2010 ActiveHR
6Candace Molton  4/19/20104/4/2012InactiveHR
7Evelyne Kneeland  4/22/2010 ActiveHR
8Elois Hires  5/26/20108/11/2013InactiveHR
9Keeley Dewolf  1/11/20105/27/2012InactiveHR
10Melisa Padua  5/16/2010 ActiveHR

 

So, I created a KM table like this:

 

KM = SUMMARIZE(HRData,HRData[Years],"Count",SUM(HRData[Count]),"d_i",SUM(HRData[Event]))

 

Then I added these columns:

 

 

Running = SUMX(FILTER(ALL(KM),[Years]<EARLIER([Years])),KM[Count])

n_i = [Count] + CALCULATE(SUM([Count]),ALL(KM)) - [Running]

d_i/n_i = KM[d_i]/KM[n_i]

1-d_i/n_i = 1-KM[d_i/n_i]

 

 

Then this measure:

 

 

MyKM = 
CALCULATE(
	PRODUCT('KM'[1-d_i/n_i]),
	FILTER(
		ALLSELECTED('KM'[Years]),
		ISONORAFTER('KM'[Years], MAX('KM'[Years]), DESC)
	)
)

 

You can then plot KM[Years] on the x-axis and MyKm on the y-axis in a line chart.

 

OK, so then I wanted to do this by Department, so I created a KMDept table like this:

 

 

KMDept = SUMMARIZE(HRData,HRData[DeptYears],"Years",MAX(HRData[Years]),"Department",MAX(HRData[Department]),"Count",SUM(HRData[Count]),"d_i",SUM(HRData[Event]))

 

And columns like these:

 

 

Running = SUMX(FILTER(ALL(KMDept),[DeptYears]<EARLIER([DeptYears]) && KMDept[Department]=EARLIER(KMDept[Department])),KMDept[Count])

n_i = [Count] + CALCULATE(SUM([Count]),ALL(KMDept),FILTER(KMDept,KMDept[Department]=EARLIER(KMDept[Department]))) - [Running]

d_i/n_i = KMDept[d_i]/KMDept[n_i]

1-d_i/n_i = 1-KMDept[d_i/n_i]

 

And then a measure like this:

 

 

MyKMDept = 
CALCULATE(
	PRODUCT('KMDept'[1-d_i/n_i]),
	FILTER(
		ALLSELECTED('KMDept'[Years]),
		ISONORAFTER('KMDept'[Years], MAX('KMDept'[Years]), DESC)
	)
)

Then, you can plot KMDept[Years] on the x-axis and MyKMDept on the y-axis. 

 

Again, the results are similar, but not 100% the same and not sure what can be done to improve the solution. I'll keep working on it but it's an interesting problem. And if anyone has a better way to solve it, I'm up for it!!


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

First, sorry, forgot to tell you the original columns I created in HRData table:

 

Years = IF([Status]="Active",DATEDIFF(HRData[Hire Date],TODAY(),YEAR),DATEDIFF(HRData[Hire Date],HRData[Termination Date],YEAR))

Event = IF(HRData[Status] = "Inactive",1,0)

Count = 1

 

And, as a further update to this, first I forgot to mention that after you create the table, you need to relate the two tables together on Years or DeptYears. In addition, on a hunch I created a Days column in the HRData just like the Years column but with DAY specified in the DATEDIFF versus YEAR. I then created a KMDays table just like before and did all of the calculations using this table but everything is in Days and the two tables are related on the Days column. Here is the formula for the initial table.

 

KMDays = SUMMARIZE(HRData,HRData[Days],"Count",SUM(HRData[Count]),"d_i",SUM(HRData[Event]))

After you have that, just add the rest of the column and measure equivalents for that table. When I plot KMDays[Days] and MyKMDays measure, then I get what looks to be the exact results from the original article. At 372 days the measure is 80.26% which seems to be dead on to original article. I'm thinking its the ability of Tableau to generate a true continuous axis that is the main difference when doing this by Year.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
Fabric Community Conference

Microsoft Fabric Community Conference

Join us at our first-ever Microsoft Fabric Community Conference, March 26-28, 2024 in Las Vegas with 100+ sessions by community experts and Microsoft engineering.