Correlation coefficient

Member
6579 Views
Highlighted
Member
Posts: 54
Registered: ‎10-15-2016

Correlation coefficient

[ Edited ]

The quick measure calculates the Pearson correlation coefficient between two measures within the category.

 

NAME:

Correlation coefficient

 

DESCRIPTION:

Calculate the Pearson correlation coefficient between two measures within the category

 

PARAMETERS:

Name: Category

Tooltip: The category in which you want to calculate the correlation coefficient

Type: Categorical field

 

Name: Measure X

Tooltip: The first measure in a correlation pair

Type: Numerical field / measure

 

Name: Measure Y

Tooltip: The second measure in a correlation pair

Type: Numerical field / measure

 

DAX:

Correlation Coefficient :=
VAR Correlation_Table =
    FILTER (
        ADDCOLUMNS (
            VALUES ( {Category} ),
            "Value_X", CALCULATE ( {Measure X} ),
            "Value_Y", CALCULATE ( {Measure Y} )
        ),
        AND (
            NOT ( ISBLANK ( [Value_X] ) ),
            NOT ( ISBLANK ( [Value_Y] ) )
        )
    )
VAR Count_Items =
    COUNTROWS ( Correlation_Table )
VAR Sum_X =
    SUMX ( Correlation_Table, [Value_X] )
VAR Sum_X2 =
    SUMX ( Correlation_Table, [Value_X] ^ 2 )
VAR Sum_Y =
    SUMX ( Correlation_Table, [Value_Y] )
VAR Sum_Y2 =
    SUMX ( Correlation_Table, [Value_Y] ^ 2 )
VAR Sum_XY =
    SUMX ( Correlation_Table, [Value_X] * [Value_Y] )
VAR Pearson_Numerator =
    Count_Items * Sum_XY - Sum_X * Sum_Y
VAR Pearson_Denominator_X =
    Count_Items * Sum_X2 - Sum_X ^ 2
VAR Pearson_Denominator_Y =
    Count_Items * Sum_Y2 - Sum_Y ^ 2
VAR Pearson_Denominator =
    SQRT ( Pearson_Denominator_X * Pearson_Denominator_Y )
RETURN
    DIVIDE ( Pearson_Numerator, Pearson_Denominator )

 

 

Attachment
Member
Posts: 55
Registered: ‎12-28-2015

Re: Correlation coefficient

Hello @Daniil,

 

 

Thanks for this, this a great solution for correlation calculations on DAX.

 

I have tried the calculation with different measures and I will add the following improvement. In the following line, you could get a negative number.

 

VAR Pearson_Denominator =
    SQRT ( Pearson_Denominator_X * Pearson_Denominator_Y )

I suggest this slight modification to run in all scenarios.

 

VAR Pearson_Denominator =
    SQRT(ABS( Pearson_Denominator_X * Pearson_Denominator_Y ))

 

Regards,

Frequent Visitor
Posts: 3
Registered: ‎11-14-2017

Re: Correlation coefficient

Hi acanepa

 

Good programming practice would suggest avoiding a division by zero, however you need to think like a statistician - which can often be counter intuitive!

 

You actually want the Pearson Coefficient to "fail" when you divide by zero.

See this post for more info https://stackoverflow.com/questions/38548343/pearson-correlation-fails-for-perfectly-correlated-sets

 

From a DAX point of view the divide function will tolerate a division by zero.

 

Regards

Graeme

New Member
Posts: 1
Registered: ‎01-24-2018

Re: Correlation coefficient

Very handy addition.

Are there, however, plans to add a measure/some other output feature that will also report on the uncertainty of the Correlation Coefficient calculated for a given series pair (i.e. implementing Fisher's z-transformation and evaluating the confidence interval at difference levels that the user chooses, or just a standard set of levels like 80%, 90 % and 95%)

The risk is that people could state (and frequently do state) correlation coefficients for insufficiently sized samples and derive insights that are actually attributable to noise etc.

Thanks for the awesome work! Smiley Very Happy