Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more.
Get startedGrow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.
Hi All,
Would you please help in following scenario?
I have a table that contains ages and city information as follow.
I want to create the normal distribution for the age column. The formula to create normal distribution is as follows
NORM.DIST(X, Mean, Standard_dev, Cumulative)
I created the calculated column to calculate the normal distribution of complete data set and it worked fine however the actual problem arises when I use a filter let’s say city filter. Here the problem is that normal distribution depends on the average and standard deviation, and these calculation will be different for complete dataset and for subset of data. And in that case calculated column would not work as these are static and pre-calculate the normal distribution for the entire dataset.
The other alternative that comes into my mind is create a measure that return a table that consist of subset of data along with its normal distribution value. The measure should return the data depend on slicer value. e.g if we select Barcelona from city slicer then here is what I am expecting to return by that measure.
City | age | n.d |
Barcelona | 0 | 0.002 |
Barcelona | 0 | 0.002 |
Barcelona | 7 | 0.002 |
Barcelona | 33 | 0.001 |
Barcelona | 33 | 0 |
The end goal is to use normal distribution along age in the chart. If this is not possible by measure then would it be possible to create a virtual table?
Any pointers/ help from community would be highly appreciated.
Best
zeebee
Solved! Go to Solution.
@zebra
I've adjusted the calculation. Kindly review the attached file. The chart now aligns with Barcelona as expected in your results. I utilized MAX to capture the current value. The choice of MIN, SUM, or AVG would also suffice since we're dealing with a single value. The crucial aspect is employing an aggregation function to obtain the filter context value.
Below is the DAX calculation for NORM.DIST. It now specifically takes into account the current selection by using ALLSELECTED.
Norm.Dist =
VAR __Mean = CALCULATE( AVERAGE(Sheet1[Age]) , ALLSELECTED( Sheet1 ))
VAR __StdDev = CALCULATE( STDEV.S(Sheet1[Age]) , ALLSELECTED( Sheet1 ))
VAR __Result = NORM.DIST( AVERAGE( Sheet1[Age]) ,__Mean ,__StdDev,FALSE())
RETURN
__Result
⭕ Subscribe and learn Power BI from these videos
⚪ Website ⚪ LinkedIn ⚪ PBI User Group
@zebra
To use any column on a chart, it has to come from your model, virtaul table column cannot be used. But if you want to use the data generated for Normal Distribution and aggregate it, then a measure would work. Check my example below, I created a virtual table of the Norm.Dist but that column has to aggregated to use on a measure.
Norm.Dist =
VAR __Age = SELECTCOLUMNS( Table20 , Table20[age] )
VAR __Mean = AVERAGE( Table20[age] )
VAR __StdDev = STDEV.S( Table20[age] )
VAR __nd_Table =
ADDCOLUMNS(
SELECTCOLUMNS( Table20 , Table20[age] ),
"@nd" ,
NORM.DIST( Table20[age] ,__Mean ,__StdDev,FALSE())
)
RETURN
MAXX( __nd_Table , [@nd] )
⭕ Subscribe and learn Power BI from these videos
⚪ Website ⚪ LinkedIn ⚪ PBI User Group
@Fowmy , I appreiciate your post however as I mentioned above, normal distribution needs to draw on a chart and it needs to be calculated dynamically based on slicer selection, the aggregation is not required. This is quite simple requirement and I am amazed to know that power BI can not handle such basic calculation that is return a non scalar measure to be use by a graph. I believe this feature should be requested if unavailable right now.
any other alternative you can thought of?
@zebra
Please check the attached file:
Norm.Dist =
VAR __Age = ALLSELECTED( Sheet1[age] )
VAR __Mean = AVERAGEX( __Age , Sheet1[age] )
VAR __StdDev = STDEVX.S( __Age , Sheet1[age] )
RETURN
NORM.DIST( MAX( Sheet1[Age]) ,__Mean ,__StdDev,FALSE())
⭕ Subscribe and learn Power BI from these videos
⚪ Website ⚪ LinkedIn ⚪ PBI User Group
@Fowmy , thanks for your solution, however this is not correct. I said this becuase you are calculating average and standard deviation for whole dataset however we need to calculate it based on subset of data. this can be verified by looking at the barcelona graph which is as follow
it should not be like this. if you create another sheet that contains only barcelona data then you will see the correct graph.
please see the following post how this graph should look like.
in your formula, we only need to select that ages according to slicer selection and then calculate the avereage and std for the selected dataset. one more question why do you use max function in the return statement?
thanks
zeebee
@zebra
I've adjusted the calculation. Kindly review the attached file. The chart now aligns with Barcelona as expected in your results. I utilized MAX to capture the current value. The choice of MIN, SUM, or AVG would also suffice since we're dealing with a single value. The crucial aspect is employing an aggregation function to obtain the filter context value.
Below is the DAX calculation for NORM.DIST. It now specifically takes into account the current selection by using ALLSELECTED.
Norm.Dist =
VAR __Mean = CALCULATE( AVERAGE(Sheet1[Age]) , ALLSELECTED( Sheet1 ))
VAR __StdDev = CALCULATE( STDEV.S(Sheet1[Age]) , ALLSELECTED( Sheet1 ))
VAR __Result = NORM.DIST( AVERAGE( Sheet1[Age]) ,__Mean ,__StdDev,FALSE())
RETURN
__Result
⭕ Subscribe and learn Power BI from these videos
⚪ Website ⚪ LinkedIn ⚪ PBI User Group
thanks @Fowmy , this is exactly I was looking for. I am asking few questions just for my knowledge in the following code.
Norm.Dist = VAR __Mean = CALCULATE( AVERAGE(Sheet1[Age]) , ALLSELECTED( Sheet1 )) VAR __StdDev = CALCULATE( STDEV.S(Sheet1[Age]) , ALLSELECTED( Sheet1 )) VAR __Result = NORM.DIST( AVERAGE( Sheet1[Age]) ,__Mean ,__StdDev,FALSE()) RETURN __Result
I understand the ALLSELECTED is used to select the current selection from sheet1 in calculating mean and std dev. however in calculating NORM.DIST(AVERAGE(Sheet[Age]) , how we are restricting here to use only those points that are currently selected? I knew that we are applying this funtion to a single value however how this single value is from the selected list?
Hope I explain the question well.
thanks
zeebee
@zebra
Appreceae it!
In response to your query about employing AVERAGE on the Age column, it's crucial to note that the NORM.DIST function requires a scalar value as its first parameter, representing the value for which the distribution is sought. The X-Axis, denoting age, is already filtered based on your canvas selection, such as Spain>Barcelona. By utilizing AVERAGE, I've chosen the present value from this filtered X-Axis. Alternatively, you can opt for any other aggregate function. I trust this clarifies the concept.
You have also posted the this question using a different thread, I can post the same answer their as well and could you accept it as it will be helpful for those who visit that question.
Re: Calculating Normal Distribution based on selec... - Microsoft Fabric Community
⭕ Subscribe and learn Power BI from these videos
⚪ Website ⚪ LinkedIn ⚪ PBI User Group
sure. please post and i will accept it as a solution.
@zebra
I posted the reply: Re: Calculating Normal Distribution based on selec... - Microsoft Fabric Community
⭕ Subscribe and learn Power BI from these videos
⚪ Website ⚪ LinkedIn ⚪ PBI User Group
@zebra Hey,
you can try this. NORM.DIST function (DAX) - DAX | Microsoft Learn
and slicer based on measure - What are Field Parameters in Power BI? | phData
Thanks
Harish
Did I answer your question? Mark my post as a solution! Appreciate your Kudos !!
please read the question carefully. it looks like that you misunderstood the question. I am providing the link of pibx file . would be good if you send back with correct implementation
@zebra , Based on what I got,
Can you try if this can help
calculate(NORM.DIST(X, Mean, Standard_dev, Cumulative) , all())
this is not working and will not work. are you trying to create a calculated column or a measure? i am attaching pibx file and would appriacte if you can add implmentation in this and send it back
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
User | Count |
---|---|
85 | |
85 | |
73 | |
68 | |
63 |
User | Count |
---|---|
214 | |
124 | |
117 | |
82 | |
76 |