Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
OK, been working with school data and can't seem to wrap my head around one particular issue so hopefully someone can help out.
Basically, what I want to do is to calculate how many standard deviations each school district is away from the average score of all school districts but have it so that the average score and the standard deviation is influenced by what category the school district falls into.
So, for example, let's say that school districts have an average score of 2 and the standard deviation is .5. For all rural schools only, the average score is 2.75 and the standard deviation is .25. If I therefore have a school that ranks a 3, the resulting standard deviations from mean would be 2 if looking at all schools ((3 - 2)/.5) and would be .5 for rural schools only ((3 - 2.75)/.25). Hopefully that makes sense. The idea here is that this metric can be used to find the outliers, those schools that do better than their peers and thus they can be studied to see what they do differently than the other schools in order to help all students do better in those other schools.
The issue is that I can't seem to get it to work the way I want. I can calculate the mean and stddev easy enough and I can create my measure easy enough but it doesn't do what I want because the visualization filters it to each school individually, which isn't helpful in this case. So, I tried measures like:
Overall Score StdDevs from Mean = (MAX([OVERALL SCORE]) - CALCULATE([Overall Score District Mean],ALL(DISTRICT))) / CALCULATE([Overall Score District StdDev],ALL(DISTRICT))
The problem with the above measure is that this score is not impacted by the school category when I select one in a slicer. The one below:
Overall Score StdDevs from Mean = (MAX([OVERALL SCORE]) - CALCULATE([Overall Score District Mean],ALLEXCEPT(TYPOLOGIES,TYPOLOGIES[ Major Grouping]))) / CALCULATE([Overall Score District StdDev],ALLEXCEPT(TYPOLOGIES,TYPOLOGIES[ Major Grouping]))
This returns NaN.
And this version also remains constant regardless.
Overall Score StdDevs from Mean = (MAX(DISTRICT[OVERALL SCORE]) - CALCULATE([Overall Score District Mean],ALL(DISTRICT),ALLEXCEPT(TYPOLOGIES,TYPOLOGIES[ Major Grouping]))) / CALCULATE([Overall Score District StdDev],ALL(DISTRICT),ALLEXCEPT(TYPOLOGIES,TYPOLOGIES[ Major Grouping]))
It seems to me that there has to be an easy solution to this that I am just missing but I can't wrap my head around how to get the result I want. Perhaps I am going down the wrong path here and there is a different method to achieve what I want?
So, here is some sample data:
DISTRICT
DISTRICT IRN | DISTRICT NAME | OVERALL SCORE |
442 | Manchester Local | 2.75 |
43489 | Akron City | 1.31 |
43497 | Alliance City | 1.63 |
43505 | Ashland City | 2.63 |
43513 | Ashtabula Area City | 1.94 |
43521 | Athens City | 2.69 |
43539 | Barberton City | 1.69 |
43547 | Bay Village City | 4.19 |
43554 | Beachwood City | 3.63 |
43562 | Bedford City | 1.69 |
43570 | Bellaire Local | 1.69 |
43588 | Bellefontaine City | 3.25 |
43596 | Bellevue City | 2.94 |
43604 | Belpre City | 2.19 |
43612 | Berea City | 1.94 |
43620 | Bexley City | 2.88 |
43638 | Bowling Green City | 3.25 |
43646 | Brecksville-Broadview Heights City | 4.19 |
43653 | Brooklyn City | 2.69 |
43661 | Brunswick City | 3.5 |
43679 | Bryan City | 2.94 |
43687 | Bucyrus City | 1.81 |
43695 | Cambridge City | 2.75 |
43703 | Campbell City | 1.75 |
43711 | Canton City | 1.56 |
TYPOLOGY BY DISTRICT
IRN | District Name | 2013 Typology |
442 | Manchester Local (Adams) | 1 |
43489 | Akron City | 8 |
43497 | Alliance City | 7 |
43505 | Ashland City | 4 |
43513 | Ashtabula Area City | 4 |
43521 | Athens City | 5 |
43539 | Barberton City | 7 |
43547 | Bay Village City | 6 |
43554 | Beachwood City | 6 |
43562 | Bedford City | 7 |
43570 | Bellaire Local | 4 |
43588 | Bellefontaine City | 4 |
43596 | Bellevue City | 1 |
43604 | Belpre City | 4 |
43612 | Berea City | 7 |
43620 | Bexley City | 6 |
43638 | Bowling Green City | 5 |
43646 | Brecksville-Broadview Heights City | 6 |
43653 | Brooklyn City | 7 |
43661 | Brunswick City | 5 |
43679 | Bryan City | 4 |
43687 | Bucyrus City | 4 |
43695 | Cambridge City | 4 |
43703 | Campbell City | 7 |
43711 | Canton City | 8 |
Topologies
2013 Typology Code | Major Grouping |
0 | Island |
1 | Rural |
2 | Rural |
3 | Small Town |
4 | Small Town |
5 | Suburban |
6 | Suburban |
7 | Urban |
8 | Urban |
I built these measures (in DISTRICT):
Overall Score District Mean = AVERAGE(DISTRICT[OVERALL SCORE]) Overall Score District StdDev = STDEV.P(DISTRICT[OVERALL SCORE])
Here are the relationships
Solved! Go to Solution.
Thanks for the suggestion @v-haibl-msft. I actually solved this over the weekend a different way but I'll see if I can find the time to go back and try that method. What I ended up doing was merging my three tables into a single table and then wrapping the calculation of the average in a CALCULATE statement with an ALLEXCEPT clause that included the two categories that I wanted to factor into the calculation dynamically. Worked as expected once the data was all in a single table. You can actually see the end result here in the Data Stories Gallery:
I just published it last night. Thanks again @KGrice for taking a look at this also.
Hi @KGrice,
I have almost the same problem as Greg but I've had to calculate my StdDev via the below using Summarize and Calculatetable. My data is a tranactional table, standard deviation is required at the monthly level with filtering to two seperate hierarchies built on location. I need the standard deviation (based on month) to dynmaically reflect the filters.
Hi @Greg_Deckler. I think this works, but I'm not sure if you're wanting to show your rural and not rural scores in the same measure depending on where the shool is, or if they should be separate. Here's what I have now based on the sample data:
And here are the measures in the order they appear:
Overall Score District Mean = AVERAGE(DISTRICT[OVERALL SCORE])
Overall Score District StdDev = STDEV.P(DISTRICT[OVERALL SCORE])
Mean All Districts = CALCULATE(
[Overall Score District Mean],
ALL(District[DISTRICT NAME])
)
StdDev All Districts = CALCULATE(
[Overall Score District StdDev],
ALL(District[DISTRICT NAME])
)
Overall Score StdDevs from Mean =
([Overall Score District Mean] - [Mean All Districts]) / [StdDev All Districts]
Mean All Districts Rural = CALCULATE(
[Mean All Districts],
Topologies[ Major Grouping]="Rural"
)
StdDev All Districts Rural = CALCULATE(
[StdDev All Districts],
Topologies[ Major Grouping]="Rural"
)
Overall Score StdDevs from Mean Rural = IF(
HASONEVALUE(Topologies[ Major Grouping]),
IF(
VALUES(Topologies[ Major Grouping])="Rural",
([Overall Score District Mean] - [Mean All Districts Rural]) / [StdDev All Districts Rural],
BLANK()
),
([Overall Score District Mean Rural] - [Mean All Districts Rural]) / [StdDev All Districts Rural]
)
Hi @KGrice,
I have almost the same problem as Greg but I've had to calculate my StdDev via the below using Summarize and Calculatetable. My data is a tranactional table, standard deviation is required at the monthly level with filtering to two seperate hierarchies built on location. I need the standard deviation (based on month) to dynmaically reflect the filters.
Thanks @KGrice. This definitely works and essentially what I asked for, but not what I want! 🙂
I was *hoping* that I could make this measure dynamic such that I could put a slicer on the page and when the user clicked the slicer, it calculated things correctly.
The end go is that I will probably have two categories, "rural", "urban", etc. and then a category on level of poverty "<= 10%", "11%-20%".
Sorry, I feel like I'm whining... 🙂
Maybe you can try to use the ALLSELECTED function to make the measure dynamic according to your slicer.
Best Regards,
Herbert
Thanks for the suggestion @v-haibl-msft. I actually solved this over the weekend a different way but I'll see if I can find the time to go back and try that method. What I ended up doing was merging my three tables into a single table and then wrapping the calculation of the average in a CALCULATE statement with an ALLEXCEPT clause that included the two categories that I wanted to factor into the calculation dynamically. Worked as expected once the data was all in a single table. You can actually see the end result here in the Data Stories Gallery:
I just published it last night. Thanks again @KGrice for taking a look at this also.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
110 | |
95 | |
76 | |
65 | |
51 |
User | Count |
---|---|
146 | |
109 | |
106 | |
88 | |
61 |