skip to main content
Power BI
    • What is Power BI
    • Why Power BI
    • Customer stories
    • Data visuals
    • Security
    • Power BI Desktop
    • Power BI Pro
    • Power BI Premium
    • Power BI Mobile
    • Power BI Embedded
    • Power BI Report Server
  • Pricing
    • Azure + Power BI
    • Microsoft 365 + Power BI
    • Dynamics 365 + Power BI
      • Energy
      • Healthcare
      • Manufacturing
      • Media
      • Retail
    • For analysts
    • For IT
      • Overview
      • Embedded analytics
      • Power BI visuals
      • Automation
      • Documentation
      • Community
    • Overview
    • Find consulting services
    • Partner showcase
    • Find a partner
    • Become a partner
    • Instructor-led training
    • Getting started
      • Overview
      • Online workshops
      • Self-guided learning
      • Webinars
      • Documentation
      • Roadmap
      • Overview
      • Issues
      • Give feedback
    • Blog
    • Business intelligence topics
    • Overview
    • Forums
    • Galleries
    • Submit ideas
    • Events
    • User groups
    • Community blog
    • Register
    • ·
    • Sign in
    • ·
    • Help
    Go To
    • Galleries
    • Community Connections & How-To Videos
    • COVID-19 Data Stories Gallery
    • Themes Gallery
    • Data Stories Gallery
    • R Script Showcase
    • Webinars and Video Gallery
    • Quick Measures Gallery
    • 2021 MSBizAppsSummit Gallery
    • 2020 MSBizAppsSummit Gallery
    • 2019 MSBizAppsSummit Gallery
    • Events
    cancel
    Turn on suggestions
    Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.
    Showing results for 
    Search instead for 
    Did you mean: 
    • Microsoft Power BI Community
    • Galleries
    • Quick Measures Gallery
    • Re: Correlation coefficient

    Re: Correlation coefficient

    08-07-2022 06:03 AM

    Super User lbendlin
    Super User
    9220 Views
    LinkedIn LinkedIn Facebook Facebook Twitter Twitter
    Daniil
    Daniil Kudo Kingpin
    Kudo Kingpin
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    Correlation coefficient

    ‎06-18-2017 05:07 PM

    The quick measure calculates the Pearson correlation coefficient between two measures within the category.

     

    NAME:

    Correlation coefficient

     

    DESCRIPTION:

    Calculate the Pearson correlation coefficient between two measures within the category

     

    PARAMETERS:

    Name: Category

    Tooltip: The category in which you want to calculate the correlation coefficient

    Type: Categorical field

     

    Name: Measure X

    Tooltip: The first measure in a correlation pair

    Type: Numerical field / measure

     

    Name: Measure Y

    Tooltip: The second measure in a correlation pair

    Type: Numerical field / measure

     

    DAX:

    Correlation Coefficient :=
    VAR Correlation_Table =
        FILTER (
            ADDCOLUMNS (
                VALUES ( {Category} ),
                "Value_X", CALCULATE ( {Measure X} ),
                "Value_Y", CALCULATE ( {Measure Y} )
            ),
            AND (
                NOT ( ISBLANK ( [Value_X] ) ),
                NOT ( ISBLANK ( [Value_Y] ) )
            )
        )
    VAR Count_Items =
        COUNTROWS ( Correlation_Table )
    VAR Sum_X =
        SUMX ( Correlation_Table, [Value_X] )
    VAR Sum_X2 =
        SUMX ( Correlation_Table, [Value_X] ^ 2 )
    VAR Sum_Y =
        SUMX ( Correlation_Table, [Value_Y] )
    VAR Sum_Y2 =
        SUMX ( Correlation_Table, [Value_Y] ^ 2 )
    VAR Sum_XY =
        SUMX ( Correlation_Table, [Value_X] * [Value_Y] )
    VAR Pearson_Numerator =
        Count_Items * Sum_XY - Sum_X * Sum_Y
    VAR Pearson_Denominator_X =
        Count_Items * Sum_X2 - Sum_X ^ 2
    VAR Pearson_Denominator_Y =
        Count_Items * Sum_Y2 - Sum_Y ^ 2
    VAR Pearson_Denominator =
        SQRT ( Pearson_Denominator_X * Pearson_Denominator_Y )
    RETURN
        DIVIDE ( Pearson_Numerator, Pearson_Denominator )

     

     

    eyJrIjoiMGQ5YzJiYTItZWFiMy00MGI2LTg1NzktYjMwYTU1YjA2N2M3IiwidCI6ImQzMmNkYzNmLTY1NTUtNGNhYy1iYjFhLTg2OWZiMTE0MzRlNSJ9

    Preview file
    6 KB
    Correlation.pbix
    Labels:
    • Labels:
    • Mathematical
    Message 1 of 13
    88,268 Views
    32
    Reply
    • All forum topics
    • Previous Topic
    • Next Topic
    systemnova
    systemnova
    New Member
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    ‎01-17-2021 02:31 PM

    This is wonderfully useful thanks @Daniil 
    I've been trying to modify your DAX to use in a scenario where the variables to be correlated are pivoted and the categories are contained in an attribute column (example below). Sadly I'm not having much luck, does anyone have any suggestions on how to generate a correlation matrix with this data structure?

    Item Attribute Value
    A Value X 2
    B Value X 3
    C Value X 5
    D Value X 7
    A Value Y 0
    B Value Y 1
    C Value Y 1
    D Value Y 2
    Message 8 of 13
    38,105 Views
    0
    Reply
    lbendlin
    Super User lbendlin
    Super User
    In response to systemnova
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    ‎08-10-2021 04:41 AM

    Statistics only work with numbers. You need to add numerical key columns for your text columns before you can calculate the correlation.

    Message 9 of 13
    27,442 Views
    0
    Reply
    brownrice
    brownrice Helper III
    Helper III
    In response to lbendlin
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    ‎08-06-2022 12:59 PM

    Apologies for reviving an old question however I want to check your response to systemnova if you wouldn't mind. 

    Say I wanted to check the correlation between day of the week and temperature and had collected the below data:

     

    IndexCategoryTemp
    1Monday20
    2Monday24
    3Monday19
    4Wednesday15
    5Wednesday28
    6Wednesday12
    7Wednesday20
    8Friday20
    9Friday10

     

    Are you saying I can swap the days of the week for a numerical key such as Monday = 1, Wednesday = 2 and Friday = 3 to give the below and then use this quick measure to produce meaningful results? Unfortunately I am still very new to stats and Power BI. Any direction would be greatly appreciated. Thanks.

     

    IndexCategoryTemp
    1120
    2124
    3119
    4215
    5228
    6212
    7220
    8320
    9310

     

    Message 10 of 13
    9,301 Views
    0
    Reply
    lbendlin
    Super User lbendlin
    Super User
    In response to brownrice
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    ‎08-06-2022 06:43 PM

    yes, that's what I am saying.  Replace text values with their numerical index and compute the correlation over the indexes. Afterwards you can map it back.

     

    (bit unfortunate that your sample table has an "Index" column. Please discard that, it doesn't help)

    Message 11 of 13
    9,275 Views
    0
    Reply
    brownrice
    brownrice Helper III
    Helper III
    In response to lbendlin
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    ‎08-07-2022 12:45 AM

    Discard the Index column? I assumed that would be the 'Category' element of the quick measure. Would it be correct to instead have the following, where the elements of the quick measure are:

     

    Category = Day of Week column

    Measure X = Day of Week Key column
    Measure Y = Temp column

     

    IndexDay of WeekDay of Week KeyTemp
    1Monday120
    2Monday124
    3Monday119
    4Wednesday215
    5Wednesday228
    6Wednesday212
    7Wednesday220
    8Friday320
    9Friday310
    Message 12 of 13
    9,251 Views
    0
    Reply
    lbendlin
    Super User lbendlin
    Super User
    In response to brownrice
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    ‎08-07-2022 06:03 AM

    Ah, the quick measure. I never understood why they added a category there and made it mandatory. It has (in my opinion) nothing to do with the computation.  I always implement the Pearson algorithm manually.

    Message 13 of 13
    9,220 Views
    0
    Reply
    pchapple
    pchapple Helper III
    Helper III
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    ‎03-19-2020 06:04 AM

    Hello everyone,

    A little bit stuck on this one...  Can someone guide me?I have a data set with the following fields...

     

    1. Year
    2. Venue
    3. Event Length(in days)
    4. Average Event revenue per day

     

    I want to see if there's a correlation between even length and event revenue per day.  I thought I could use the year as the category, but I think this might be wrong.  Using the Quick Measure, what Category, and 2 measures should I use to establish the hypotheses that "the longer the event, the more/less menoy per day is made"

     

    Thanks!

    Message 7 of 13
    53,297 Views
    0
    Reply
    PowerBI_77
    PowerBI_77
    Frequent Visitor
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    ‎01-22-2019 03:14 AM

    Thanks for the formula Daniil.  I'm trying to use it to calculate the correlation coefficient on some data I have but I have hit an issue - I only have two columns of data, one showing the month name and one showing the volume (a measure).  

    I.e.

    Month     Volume

    Jan-07      1,000

    Feb-07     1,613

    Mar-07     1,128

    etc.

     

    I don't have two measures as per your data nor do I have a column showing a 'category'.  How would I be able to carry out the calculation for my data please?  Do I need to create a new measure column based on the month name, coverting it into a numerical value?  (just a guess).  Then I would classify the month column as the 'category'?  

     

    Hope this is clear (still classing myself as a Power BI newbie!) 

    Regards,

    Lee

    Message 5 of 13
    76,628 Views
    0
    Reply
    Daniil
    Daniil Kudo Kingpin
    Kudo Kingpin
    In response to PowerBI_77
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    ‎01-22-2019 02:39 PM

    @PowerBI_77 please have a look at the second example in my Simple Linear Regression blog -- you can employ a similar technique here.

     

    Also, in case people still read this -- @acanepa said in a private message there "is not an error of the calculation you created but rather an error on my end to feed the formula with wrong numbers", so please ignore his comment about negative numbers.

    Message 6 of 13
    76,575 Views
    1
    Reply
    ZanderFick
    ZanderFick
    New Member
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    ‎01-24-2018 10:33 PM

    Very handy addition.

    Are there, however, plans to add a measure/some other output feature that will also report on the uncertainty of the Correlation Coefficient calculated for a given series pair (i.e. implementing Fisher's z-transformation and evaluating the confidence interval at difference levels that the user chooses, or just a standard set of levels like 80%, 90 % and 95%)

    The risk is that people could state (and frequently do state) correlation coefficients for insufficiently sized samples and derive insights that are actually attributable to noise etc.

    Thanks for the awesome work! Smiley Very Happy

    Message 4 of 13
    85,685 Views
    0
    Reply
    acanepa
    acanepa Resolver I
    Resolver I
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    ‎11-15-2017 09:59 AM

    Hello @Daniil,

     

     

    Thanks for this, this a great solution for correlation calculations on DAX.

     

    I have tried the calculation with different measures and I will add the following improvement. In the following line, you could get a negative number.

     

    VAR Pearson_Denominator =
        SQRT ( Pearson_Denominator_X * Pearson_Denominator_Y )

    I suggest this slight modification to run in all scenarios.

     

    VAR Pearson_Denominator =
        SQRT(ABS( Pearson_Denominator_X * Pearson_Denominator_Y ))

     

    Regards,

    Message 2 of 13
    87,226 Views
    3
    Reply
    numbus
    numbus
    Frequent Visitor
    In response to acanepa
    • Mark as New
    • Bookmark
    • Subscribe
    • Mute
    • Subscribe to RSS Feed
    • Permalink
    • Print
    • Report Inappropriate Content

    ‎11-26-2017 11:01 AM

    Hi acanepa

     

    Good programming practice would suggest avoiding a division by zero, however you need to think like a statistician - which can often be counter intuitive!

     

    You actually want the Pearson Coefficient to "fail" when you divide by zero.

    See this post for more info https://stackoverflow.com/questions/38548343/pearson-correlation-fails-for-perfectly-correlated-sets

     

    From a DAX point of view the divide function will tolerate a division by zero.

     

    Regards

    Graeme

    Message 3 of 13
    87,136 Views
    0
    Reply

    Power Platform

    • Overview
    • Power BI
    • Power Apps
    • Power Pages
    • Power Automate
    • Power Virtual Agents

    • Sign in
    • Sign up

    Browse

    • Solutions
    • Partners
    • Consulting Services

    Downloads

    • Power BI Desktop
    • Power BI Mobile
    • Power BI Report Server
    • See all downloads

    Learn

    • Guided learning
    • Documentation
    • Support
    • Community
    • Give feedback
    • Webinars
    • Developers
    • Blog
    • Newsletter

    © 2023 Microsoft

    Follow Power BI

    • Privacy & cookies
    • Manage cookies
    • Terms of use
    • Trademarks
    California Consumer Privacy Act (CCPA) Opt-Out Icon Your California Privacy Choices