Solved: Re: Breaking down the normal distribution by attri...

Anonymous · ‎02-14-2020

Hey,

We have just conducted a survey and I'm looking to understand if the respondents who answered the survey are representitive of our population as a whole. Mainly, are we getting the same proportion of learner in our sample as in the population at large

I want to conduct a two tailed Significance Test for a Proportion. I have started by looking to lift the Norm.Dist() formula from Excel.. but I can't get it to break down by sector

for the left tailed hypothesis

NORM.DIST([Percentage of respondents by sector],[Percentage in population by sector],COUNTROWS(ALLSELECTED('Learner Experience Baseline data')),1)

But when drop this measure in a table, all sectors return the same result.

Any ideas?

v-lionel-msft · ‎02-19-2020

Hi @Anonymous ,

First of all, your model relationship may be a bit problematic, it should be like this:

Secondly, the results returned by your two percentage measures may have problems, please refer to the following formula:

Percent of respondents = 
VAR x = 
CALCULATE(
    DISTINCTCOUNT('Sample'[External Data Reference]),
    ALLEXCEPT(
        'Sample',
        'Sample'[Sector]
    )
)
VAR y = 
CALCULATE(
    DISTINCTCOUNT('Sample'[External Data Reference]),
    ALL('Sample')
)
RETURN
DIVIDE(
    x,
    y,
    BLANK()
)

Percent of total Population = 
VAR x = 
CALCULATE(
    DISTINCTCOUNT(Population[ID]),
    ALLEXCEPT(
        'Population',
        'Population'[Sector]
    )
)
VAR y = 
CALCULATE(
    DISTINCTCOUNT(Population[ID]),
    ALLSELECTED('Sample'[Sector])
)
RETURN
 DIVIDE(
    x,
    y,
    BLANK()
)

Left_Tailed_Hypothesis = 
NORM.DIST(
    [Percent of respondents], 
    [Percent of total Population],
    1,
    TRUE()
)

The value of COUNTROWS(ALLSELECTED('Sample')) is too big, which is not suitable for Standard_dev parameters.

And its value is equal to a fixed value 899.

You can refer to https://docs.microsoft.com/en-us/dax/norm-dist-dax to learn NORM.DIST() function.

Best regards,
Lionel Chen

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

v-lionel-msft · ‎02-16-2020

Hi @Anonymous ,

As @sturlaws said, we may need the formulas of [Percentage of respondents by sector] and [Percentage in population by sector] to research the problem.

Best regards,
Lionel Chen

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

sturlaws · ‎02-16-2020

Hi, @Anonymous

I assume [Percentage of respondents by sector] and [Percentage in population by sector] are measures. Could provide a screeen shot of a table visual with the sectors on the rows, with the measures and the norm distribution as values?

What are you do want to achieve with the ALLSELECTED-statement? The ALLSELECTED is a tricky function, and sometimes yields unexpected results.

Cheers,
Sturla

Anonymous · ‎02-17-2020

hey @sturlaws ,

thanks for the response and your help, the [Percentage of...] are measures and yes I am using an ALLSELECTED(). I have posted the formulas and table below

Percentage of respondents = DIVIDE(

[Number respondents],

CALCULATE(

[Number respondents],

ALLSELECTED('sample data')))

Percentage of total Population = divide([Number in Population],CALCULATE([Number in Population],ALLSELECTED('Population Data')))

Number respondents = DISTINCTCOUNT('Sample data'[Response ID])

Number in Population = DISTINCTCOUNT('Population Data'[ID])

sturlaws · ‎02-17-2020

Sorry, really bad spelling on my behalf in my previous post. What I meant to ask was, why have you included ALLSELECTED in your measures?

I created a sample report, and I can't replicate the behaviour you are experiencing. Could you share your report, or create sample report where you reproduce the behaviour you are experiencing?

Cheers,
Sturla

Anonymous · ‎02-17-2020

Hey @sturlaws ,

I was using the ALLSELECT() to gain a dynamic percentage of total.

I've created a dummy replicar in the PBI file where I am still experiencing the same issue :(... can't seem to figure out how to attach it to the thred though?

Thanks again!

Polly

sturlaws · ‎02-17-2020

Adding files in this forum is a bit awkward, you have to upload the file to onedrive/dropbox/other and share the link here

Anonymous · ‎02-18-2020

Hi @sturlaws ,

Ah i see, here you go go https://www.dropbox.com/s/chdqh4454426nvt/Normal%20Distribution.pbix?dl=0

Thanks,

Polly

sturlaws · ‎02-19-2020

Great.

If you don't have it already in your report, create a section-table/dimension, and change you measures to this:

Percentage of respondents =
DIVIDE (
    [Number respondents];
    CALCULATE ( [Number respondents]; ALL ( 'dimSector' ) )
)

And as @v-lionel-msft mentions, the value you are using for standard deviation does not make any sense. I can't really see how you should calculate the standard deviation the way you are trying to do. I am not that experienced in statistics, but is perhaps the chi-square test is more appropriate?

Cheers,
Sturla

v-lionel-msft · ‎02-19-2020

Hi @Anonymous ,

First of all, your model relationship may be a bit problematic, it should be like this:

Secondly, the results returned by your two percentage measures may have problems, please refer to the following formula:

Percent of respondents = 
VAR x = 
CALCULATE(
    DISTINCTCOUNT('Sample'[External Data Reference]),
    ALLEXCEPT(
        'Sample',
        'Sample'[Sector]
    )
)
VAR y = 
CALCULATE(
    DISTINCTCOUNT('Sample'[External Data Reference]),
    ALL('Sample')
)
RETURN
DIVIDE(
    x,
    y,
    BLANK()
)

Percent of total Population = 
VAR x = 
CALCULATE(
    DISTINCTCOUNT(Population[ID]),
    ALLEXCEPT(
        'Population',
        'Population'[Sector]
    )
)
VAR y = 
CALCULATE(
    DISTINCTCOUNT(Population[ID]),
    ALLSELECTED('Sample'[Sector])
)
RETURN
 DIVIDE(
    x,
    y,
    BLANK()
)

Left_Tailed_Hypothesis = 
NORM.DIST(
    [Percent of respondents], 
    [Percent of total Population],
    1,
    TRUE()
)

The value of COUNTROWS(ALLSELECTED('Sample')) is too big, which is not suitable for Standard_dev parameters.

And its value is equal to a fixed value 899.

You can refer to https://docs.microsoft.com/en-us/dax/norm-dist-dax to learn NORM.DIST() function.

Best regards,
Lionel Chen

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Breaking down the normal distribution by attributes

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly