Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi everyone! I need help with this: I have to categorize workers based on their seasonality. The model is very simple, i'll illustrate it on the table:
ID | Total_Shifts | Month | Year | Year_Month |
1 | 12 | 12 | 2019 | 2019-12 |
2 | 14 | 12 | 2019 | 2019-12 |
3 | 15 | 12 | 2019 | 2019-12 |
1 | 33 | 01 | 2020 | 2020-01 |
2 | 24 | 01 | 2020 | 2020-01 |
3 | 3 | 01 | 2020 | 2020-01 |
So, I have every worker with their ID and the total of shifts they worked every month of the given year (the database covers 3 years with hundreds of workers).
I have to group everyone based on this:
Occasional worker: every worker that works from 15 to 25 shifts a year.
Seasonal worker: every worker that works an average of 5,5 to 20,5 shifts by month, but only if worked for four months or less of the given year.
Sporadic worker: every worker that works an average of 5,5 to 20,5 shifts by month.
Regular: from 15 to 25 shifts by month.
The others just fit into the "Other" category.
I think I have to make this with a calculated column and a bunch of ifs.
If I'd have to make a measure of this I guess it wouldn't be that difficult because the chart in itself groups the values, but I don't know how can I do to group this calculations year by year as it happens with the Occasional worker or how can I use moving months for the Seasonal worker (the one I think is the most difficult).
After categorizing, I have to calculate how many people fit into every category and calculate the average shifts for every category.
I'd very much appreciate all the help you can give me,
Thank you for your time,
Best regards,
Gabriela
Solved! Go to Solution.
Ok, I think I have a better understanding, and I'd do this in Power Query. A simple SWITCH() won't cut it. A more complex one would but you need more table transformations. Thank you for the data. Insanely helpful understanding the issue.
See my PBIX file here. It is linked to your data, so you'll need to change it so it points to the "BD Sample.xlsx" on your computer via the Data Sources icon, Change Source item.
Here is what I did:
if [Total Shifts] >= 15 and [Total Shifts] <= 25 then "Occasional" else
if [Months Worked] =4 and [Average Shifts] >= 5.5 and [Average Shifts] <= 20.5 then "Seasonal" else
if [Months Worked] > 4 and [Average Shifts] >= 5.5 and [Average Shifts] <= 20.5 then "Sporadic" else
if [Average Shifts] >= 15 and [Average Shifts] <=25 then "Regular" else
"Other"
It goes through the 4 IF statements, and if none match, it defaults to "Other."
If my logic is wrong because I misunderstood your requirements, you should be able to modify that code. The way I did it is, as I understood your requirements:
You can change that logic, and add an additional if/then/else line before the "other." Just be careful not to cause any to overlap. For example, if you created a new 5th line of "If months worked = r and average shifts between 15 and 25" that would not work as rule #2 would grab all of those in the 5.5-20.5 range and new rule #5 would only get those between 20.5 and 25.
If you need to do some OR logic, just replace and with or, and you can use parentheses.
if ([Months Worked] < 3 or [Months Worked > 12) and [Total Shifts] < 25
as an example would first return TRUE if months were 1-3 or > 12, and then look at total shifts.
Ping back any questions.
Note: You can do this like you'd do in Excel too perhaps. Do one column per critera, then when done, combine those column logics into a single formula, but the if/then/else logic is pretty easy to construct, but you have to watch out for overlaps.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingI don't think you gave us enough info to do this, but I agree, as much as I hate calculated columns, this is what you need. And you want to use SWITCH() here.
Here is an example of how this would work:
Category =
VAR CurrentID = [ID]
VAR AverageShifts =
AVERAGEX(
FILTER(
'Table',
'Table'[ID]=CurrentID
),
'Table'[Total_Shifts]
)
RETURN
SWITCH(
TRUE(),
[Total_Shifts] >= 15 && [Total_Shifts] <= 25, "Occasional",
AverageShifts >= 5.5 && AverageShifts <= 20.5, "Seasonal"
)
That returns the following:
Ignore the "Average Shift" column. I used that to make everything between the first VAR and the RETURN logic that I then copied and pasted into the Category measure as shown above. But you can see worker #1 has 22.5 average shifts, and worker 2 has 19, etc.
But I didn't see how I was supposed to determine how many months in a year they worked with that data. But you can just add more conditions to each SWITCH line like you would in Excel. Use && for AND, and || for OR, and group with parenthesis as needed. THIS && (THAT || SOMETHIGN ELSE) for example
Hopefully that is enough to get you started.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingHi @edhans
I tried your solution and though the code works it doesn't quite fits to what I need: the Occasional worker is the one that works from 15 to 25 shifts a year; and the Seasonal worker is the one that worked that amount of average shifts but for four months or less so for example if a worker worked 6 shifts every month of the year he would be Sporadic (AVG=6) but if he worked 20 shifts for only four months he would be Seasonal (AVG = 20*4/12 = 6,6) he would be Seasonal. The database considers more than one year, and hundreds of workers, so I have to take account of the years. I share a sample DB Sample
Best regards,
Gabriela
Ok, I think I have a better understanding, and I'd do this in Power Query. A simple SWITCH() won't cut it. A more complex one would but you need more table transformations. Thank you for the data. Insanely helpful understanding the issue.
See my PBIX file here. It is linked to your data, so you'll need to change it so it points to the "BD Sample.xlsx" on your computer via the Data Sources icon, Change Source item.
Here is what I did:
if [Total Shifts] >= 15 and [Total Shifts] <= 25 then "Occasional" else
if [Months Worked] =4 and [Average Shifts] >= 5.5 and [Average Shifts] <= 20.5 then "Seasonal" else
if [Months Worked] > 4 and [Average Shifts] >= 5.5 and [Average Shifts] <= 20.5 then "Sporadic" else
if [Average Shifts] >= 15 and [Average Shifts] <=25 then "Regular" else
"Other"
It goes through the 4 IF statements, and if none match, it defaults to "Other."
If my logic is wrong because I misunderstood your requirements, you should be able to modify that code. The way I did it is, as I understood your requirements:
You can change that logic, and add an additional if/then/else line before the "other." Just be careful not to cause any to overlap. For example, if you created a new 5th line of "If months worked = r and average shifts between 15 and 25" that would not work as rule #2 would grab all of those in the 5.5-20.5 range and new rule #5 would only get those between 20.5 and 25.
If you need to do some OR logic, just replace and with or, and you can use parentheses.
if ([Months Worked] < 3 or [Months Worked > 12) and [Total Shifts] < 25
as an example would first return TRUE if months were 1-3 or > 12, and then look at total shifts.
Ping back any questions.
Note: You can do this like you'd do in Excel too perhaps. Do one column per critera, then when done, combine those column logics into a single formula, but the if/then/else logic is pretty easy to construct, but you have to watch out for overlaps.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingHi @edhans !! thank you very much for the dedication and for explaining step by step, I didn't know you could do all this on Power Query it seems a lot more simple.
I tried to do it on DAX mixing a bunch of other formulas I found and got a solution with a calculated column, here it is:
Category =
VAR ANUAL_SUM =
CALCULATE(SUM('DAX_Test'[Total Shifts]),
FILTER(ALLEXCEPT('DAX_Test', 'DAX_Test'[Year]),
'DAX_Test'[ID]=EARLIER('DAX_Test'[ID])
)
)
VAR ANUAL_AVG = ANUAL_SUM/12
VAR WORKEDMONTHS =
CALCULATE(DISTINCTCOUNT('DAX_Test'[Month]),
FILTER(ALLEXCEPT('DAX_Test','DAX_Test'[Year]),
'DAX_Test'[ID]=EARLIER('DAX_Test'[ID])
)
)
RETURN
/*Occasional: 15 to 25 shifts per year*/
if(
(ANUAL_SUM <= 25 && ANUAL_SUM >= 15) ,
"Occasional" ,
/*Seasonal: average between 5,5 to 20,5 per month on 4 months or less*/
if(
(ANUAL_AVG <= 20,5 && ANUAL_AVG >= 5,5) && (WORKEDMONTHS <= 4 ) ,
"Seasonal",
/*Sporadic: 5,5 to 20,5 shifts per month*/
if(
ANUAL_AVG <= 20,5 && ANUAL_AVG >= 5,5 ,
"Sporadic" ,
/*Regular: 9 to 24 shifts per month*/
if(
(WORKEDMONTHS >= 10) && (MAX('DAX_Test'[Total Shifts]) <= 24 && MIN('DAX_Test'[Total Shifts]) >= 9) ,
"Regular" ,
"Other"
)
)
)
)
I validated every category with your solution and mine and got some differences that I changed on Power Query and finally got a perfect match for Sporadic, Occasional and Seasonal but not for Regular. The differences were:
- The average I have to calculate takes no account of how many months the person worked, it always considers 12. This I changed and worked well.
- The Regular worker has to work from 15 to 25 shifts per month every month of the year except for two to consider vacations, so I solve that with an "and 'months worked' >=10", but couldn't figure out how to put that the range for every month must be between 15 to 25, that's different from the average and I wouldn't know how to add it on Power Query. On DAX I put a MIN and a MAX, but I'm not sure that solves it.
Could you please check this out to see if we can get to a perfect match on Regular too? Well it seems by the data that it will be almost impossible for a worker to get a Regular Status, given so many rules, but the data from one worker could be change to see if this works.
Here is the link for the pbix file
Thank you for your time,
Best regards,
Gabriela
Yes, you can do this, but I had to think about it. See my PBIX file again and look at the [Hoja1] query. The original stil lthere in [Hoja1 Original]
I basically had to figure out who was Regular (15-25 shifts) and worked 10+ months that way.
if [All Rows Count] >= 10 then "Regular" else
if [Total Shifts] >= 15 and [Total Shifts] <= 25 then "Occasional" else
if [Months Worked] <= 4 and [Average Shifts] >= 5.5 and [Average Shifts] <= 20.5 then "Seasonal" else
if [Months Worked] > 4 and [Average Shifts] >= 5.5 and [Average Shifts] <= 20.5 then "Sporadic" else
"Other"
Last step was to keep only those columns I needed and remove all of that intermediate junk.
Kudos appreciated on this one if it is helpful. The query was pretty simple, but I had to noodle through the logic of how to get a reliable count since Average wasn't going to assure it met your requirements.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingCovering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
114 | |
97 | |
85 | |
70 | |
61 |
User | Count |
---|---|
151 | |
121 | |
104 | |
87 | |
67 |