Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
mahoney
Frequent Visitor

HR Data - Slowly Changing Dimensions

I am analyzing some HR data at our company.  Below is a mock dataset i've been using to figure out these measures.  I was able to do what i needed with calculated columns (Earlier function) but, for academic reasons, I'm trying to also do it with measures.

 

This fact table is a compilation of a snapshot each year for all current employees, which organization (dept) they are part of, and their current level.  I like to create the following measures:

 

1. # of employees that changed Org from the previous year (job changes)

2. # of employees that changed Level from the previous year (promotions)

3. # of employees that joined the company

4. # of employees that left the company

 

I've made some progress but i think my approach iterating SumX over each employee is flawed, and i lose context on my my tables/charts.  It works with Year as the rows, but fails if i use Org, Level, etc.  I ended up trying variables to help keep context, which helped but still problems.

 

dcOrg:=DISTINCTCOUNT(Data[Org])

OrgChange:=Sumx(Values(Data[Employee]), if([dcOrg]=1, 0, [dcOrg]-1))

OrgChanges LastYear=var currentyear=Max(Data[Year]) Return Calculate([OrgChange], Filter(All(Data), Data[Year]=currentyear || Data[Year]=currentyear-1))

 

 

EmployeeYearOrgLevel
12013AP1
22013AP2
32013AP3
42013AP1
52013AP1
62013BP1
72013BP2
82013BP2
92013CP3
102013CP3
12014BP1
22014BP2
32014AP3
42014AP2
52014AP2
62014BP1
72014BP2
82014BP2
92014CP3
102014CP3
112014CP1
12015BP1
22015BP2
32015BP3
42015AP2
52015AP2
62015BP1
72015CP3
82015BP3
92015CP3
112015BP1
12016BP2
22016AP2
32016BP3
42016BP3
52016AP3
62016BP1
72016CP3
92016CP3
112016CP1
122016CP2

 

I also have a Date table (1 year/row).  I am to the point of overthinking it, so looking for help.  In practice, the snapshots are taken each month, and the final fact table will be 2-3 million rows (i will likely thin it out by removing duplicates as most people don't change often).  Given the # of rows, I am looking for a DAX formula that will scale well.

 

Thank you in advance for any help.

1 ACCEPTED SOLUTION
v-shex-msft
Community Support
Community Support

Hi @mahoney,

 

You can refer to below sample formula to achieve your requirement:

 

Sample measures:

Leave = 
var current_year=MAX(Employee[Year])
var previous_List=CALCULATETABLE(VALUES(Employee[Employee]),Employee[Year]=current_year-1)
var current_List=CALCULATETABLE(VALUES(Employee[Employee]),Employee[Year]=current_year)
return
COUNTROWS(EXCEPT(previous_List,current_List))+0

Joined = 
var current_year=MAX(Employee[Year])
var previous_List=CALCULATETABLE(VALUES(Employee[Employee]),Employee[Year]<current_year)
var current_List=CALCULATETABLE(VALUES(Employee[Employee]),Employee[Year]=current_year)
return
COUNTROWS(EXCEPT(current_List,previous_List))+0

LevelChanged = 
var current_year=MAX(Employee[Year])
var previous_List=CALCULATETABLE(SELECTCOLUMNS(ALLSELECTED(Employee),"Employee",Employee[Employee],"Level",Employee[Level]),Employee[Year]=current_year-1)
var current_List=CALCULATETABLE(SELECTCOLUMNS(ALLSELECTED(Employee),"Employee",Employee[Employee],"Level",Employee[Level]),Employee[Year]=current_year)
return
COUNTROWS(previous_List)-COUNTROWS(INTERSECT(current_List,previous_List))+0

OrgChanged = 
var current_year=MAX(Employee[Year])
var previous_List=CALCULATETABLE(SELECTCOLUMNS(ALLSELECTED(Employee),"Employee",Employee[Employee],"Org",Employee[Org]),Employee[Year]=current_year-1)
var current_List=CALCULATETABLE(SELECTCOLUMNS(ALLSELECTED(Employee),"Employee",Employee[Employee],"Org",Employee[Org]),Employee[Year]=current_year)
return
COUNTROWS(previous_List)-COUNTROWS(INTERSECT(current_List,previous_List))+0

6.PNG

 

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

View solution in original post

2 REPLIES 2
v-shex-msft
Community Support
Community Support

Hi @mahoney,

 

You can refer to below sample formula to achieve your requirement:

 

Sample measures:

Leave = 
var current_year=MAX(Employee[Year])
var previous_List=CALCULATETABLE(VALUES(Employee[Employee]),Employee[Year]=current_year-1)
var current_List=CALCULATETABLE(VALUES(Employee[Employee]),Employee[Year]=current_year)
return
COUNTROWS(EXCEPT(previous_List,current_List))+0

Joined = 
var current_year=MAX(Employee[Year])
var previous_List=CALCULATETABLE(VALUES(Employee[Employee]),Employee[Year]<current_year)
var current_List=CALCULATETABLE(VALUES(Employee[Employee]),Employee[Year]=current_year)
return
COUNTROWS(EXCEPT(current_List,previous_List))+0

LevelChanged = 
var current_year=MAX(Employee[Year])
var previous_List=CALCULATETABLE(SELECTCOLUMNS(ALLSELECTED(Employee),"Employee",Employee[Employee],"Level",Employee[Level]),Employee[Year]=current_year-1)
var current_List=CALCULATETABLE(SELECTCOLUMNS(ALLSELECTED(Employee),"Employee",Employee[Employee],"Level",Employee[Level]),Employee[Year]=current_year)
return
COUNTROWS(previous_List)-COUNTROWS(INTERSECT(current_List,previous_List))+0

OrgChanged = 
var current_year=MAX(Employee[Year])
var previous_List=CALCULATETABLE(SELECTCOLUMNS(ALLSELECTED(Employee),"Employee",Employee[Employee],"Org",Employee[Org]),Employee[Year]=current_year-1)
var current_List=CALCULATETABLE(SELECTCOLUMNS(ALLSELECTED(Employee),"Employee",Employee[Employee],"Org",Employee[Org]),Employee[Year]=current_year)
return
COUNTROWS(previous_List)-COUNTROWS(INTERSECT(current_List,previous_List))+0

6.PNG

 

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Thank you!  I had also been trying it with Except() and got it working (new employees example below):

Not in Last Year:=if(and(min(Data[Year])=[MinYearAll], DISTINCTCOUNT(Data[Year])=1), blank(), Countrows(EXCEPT(Values(Data[Employee]), CALCULATETABLE(Values(Data[Employee]), Filter(All(Data), Data[Year]=Max(Data[Year])-1)))))

 

I also modified the above with ALLEXCEPT() to go after Org Changes and Level Changes:

Org Changes:=if(COUNTROWS(Filter(ALLEXCEPT(Data,Data[Org]), Data[Year]=Max(Data[Year])-1))>0, blank(), Countrows(EXCEPT(Values(Data[Employee]), CALCULATETABLE(Values(Data[Employee]), Filter(ALLEXCEPT(Data,Data[Org]), Data[Year]=Max(Data[Year])-1)))))

The above worked for Org Changes when Org is used as a filter (row/column/slicer); however, i needed to iterate over Org values to force that context everytime:

Org Changes with Sumx:=Calculate(Sumx(Values(Data[Org]), [Org Changes]), Filter(ALLEXCEPT(Data, Data[Org]), Data[Year]<>[MinYearAll]))-[Not in Last Year]

I am a fan of the five-point-palm function (Sumx(Values(), Measure).

 

Your approach is much better than the above and should scale well (no iterations).  For the Org and Level changes, i only needed to subtract the [Leave] measure as they were getting counted in your measures (e.g., only 3 level changes in 2016).

 

Thank you for the help, and also for the +0 trick in your measures (to avoid blanks).

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.