cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
Post Prodigy
Post Prodigy

Correlation between same column, different items for time periods

Afternoon,

 

I'm trying to find the correlation between the [close] column values of the 'StockbarDataExample' table for different companies in the 'StockSymbolExchangeCode' column.

 

I'm not sure how to compare the two time series given they share the same column for values.

 

Pbix & Source Link: https://1drv.ms/f/s!At8Q-ZbRnAj8iF5lSSIdn2oi2tGj

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted

@ElliotP

In this sort of "pairwise comparison" scenario, where you have multiple entities in the same table (in this case distinguished by StockSymbolCurrencyI would normally follow the below steps.

My modifed copy of your pbix is here:
https://www.dropbox.com/s/y9h2ncitj46ee6a/PowerBiForumExample2%20Owen%20edit.pbix?dl=0

 

  1. Create a copy of the entity dimension table, in your case a copy of ReferenceTable which I would call ReferenceTableComparison
  2. Create a relationship between StockBarDatExample and ReferenceTableComparison, but make it inactive
  3. Create the appropriate value measure that will be used for the company selected in ReferenceTable. For testing purposes I created
    Average Close = 
    AVERAGE ( StockBarDatExample[close] )
  4. Create the same measure for the Comparison Company, which activates the inactive relationship, and clears the filter on ReferenceTable:
    Average Close Comparison =
    CALCULATE (
        [Average Close],
        ALL ( ReferenceTable ),
        USERELATIONSHIP ( StockBarDatExample[StockSymbolCurrency], ReferenceTableComparison[StockSymbolCurrency] )
    )
  5. You can then selected Company & Comparison Company using slicers, and use Average Close & Average Close Comparison together in visuals.
  6. The Pearson Correlation Coefficient can be calculated using a method similar to that used here. This relies on having the above two measures set up.
    Here is the measure I tested with your data:
    Pearson Correlation Coefficient = 
    VAR DateTimes =
        // Create a table of date/times where both stocks have a Close value
        FILTER (
            SUMMARIZE ( StockBarDatExample, DateTable[DateKey], TimeTable[Column1] ), // Date & Time columns
            AND (
                NOT ( ISBLANK ( [Average Close] ) ),
                NOT ( ISBLANK ( [Average Close Comparison] ) )
            )
        )
    // Construct table of pairs of Close values
    VAR Known =
        SELECTCOLUMNS (
            DateTimes,
            "Known[X]", [Average Close],
            "Known[Y]", [Average Close Comparison]
        )
    // Calculate correlation coefficient
    VAR Count_Items =
        COUNTROWS ( Known )
    VAR Average_X =
        AVERAGEX ( Known, Known[X] )
    VAR Average_X2 =
        AVERAGEX ( Known, Known[X] ^ 2 )
    VAR Average_Y =
        AVERAGEX ( Known, Known[Y] )
    VAR Average_Y2 =
        AVERAGEX ( Known, Known[Y] ^ 2 )
    VAR Average_XY =
        AVERAGEX ( Known, Known[X] * Known[Y] )
    VAR CorrelationCoefficient =
        DIVIDE (
            Average_XY
                - Average_X * Average_Y,
            SQRT ( ( Average_X2 - Average_X ^ 2 ) * ( Average_Y2 - Average_Y ^ 2 ) )
        )
    RETURN
        CorrelationCoefficient

The test report page in the above pbix looks like this:image.png

 

 

 

Hopefully that helps. 🙂

 

Regards,

Owen


Owen Auger

Did I answer your question? Mark my post as a solution!

Connect on Twitter
Connect on LinkedIn

View solution in original post

5 REPLIES 5
Highlighted
Community Support
Community Support

HI @ElliotP,

 

You can use below formula to get diff between two legend:

Diff = 
VAR c1 =
    FIRSTNONBLANK ( ALL ( ReferenceTable[CompanyName] ), [CompanyName] )
VAR c2 =
    LASTNONBLANK ( ALL ( ReferenceTable[CompanyName] ), [CompanyName] )
RETURN
    ABS (
        CALCULATE (
            SUM ( StockBarDatExample[close] ),
            ReferenceTable[CompanyName] = c1
        )
            - CALCULATE (
                SUM ( StockBarDatExample[close] ),
                ReferenceTable[CompanyName] = c2
            )
    )

AFAIK, current line chart not support use multiple value field and legend at same time, I'd like to suggest you use 'line and clustered column chart' to instead.

15.PNG

 

 

Regards,

Xiaoxin Sheng

 

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.
Highlighted

@v-shex-msftThanks for the response.

 

I'm not trying to calculate the difference between the different company's closes, but I think the principles might be able to work. I would like to be able to calculate the correlation between the time series data for each company; but I'm unable to at the moment as both time series values are in the same column ([close]), but they two time series are distinguishable by the 'StockSymbolCurrency' column string value.

 

The idea of using variables to seperate the time series values before returning a correlation figure is interesting, but I'm not sure how to get each variable to filter over subsequent different string values in the 'StockSymbolCurrency' column.

 

Highlighted

@ElliotP

In this sort of "pairwise comparison" scenario, where you have multiple entities in the same table (in this case distinguished by StockSymbolCurrencyI would normally follow the below steps.

My modifed copy of your pbix is here:
https://www.dropbox.com/s/y9h2ncitj46ee6a/PowerBiForumExample2%20Owen%20edit.pbix?dl=0

 

  1. Create a copy of the entity dimension table, in your case a copy of ReferenceTable which I would call ReferenceTableComparison
  2. Create a relationship between StockBarDatExample and ReferenceTableComparison, but make it inactive
  3. Create the appropriate value measure that will be used for the company selected in ReferenceTable. For testing purposes I created
    Average Close = 
    AVERAGE ( StockBarDatExample[close] )
  4. Create the same measure for the Comparison Company, which activates the inactive relationship, and clears the filter on ReferenceTable:
    Average Close Comparison =
    CALCULATE (
        [Average Close],
        ALL ( ReferenceTable ),
        USERELATIONSHIP ( StockBarDatExample[StockSymbolCurrency], ReferenceTableComparison[StockSymbolCurrency] )
    )
  5. You can then selected Company & Comparison Company using slicers, and use Average Close & Average Close Comparison together in visuals.
  6. The Pearson Correlation Coefficient can be calculated using a method similar to that used here. This relies on having the above two measures set up.
    Here is the measure I tested with your data:
    Pearson Correlation Coefficient = 
    VAR DateTimes =
        // Create a table of date/times where both stocks have a Close value
        FILTER (
            SUMMARIZE ( StockBarDatExample, DateTable[DateKey], TimeTable[Column1] ), // Date & Time columns
            AND (
                NOT ( ISBLANK ( [Average Close] ) ),
                NOT ( ISBLANK ( [Average Close Comparison] ) )
            )
        )
    // Construct table of pairs of Close values
    VAR Known =
        SELECTCOLUMNS (
            DateTimes,
            "Known[X]", [Average Close],
            "Known[Y]", [Average Close Comparison]
        )
    // Calculate correlation coefficient
    VAR Count_Items =
        COUNTROWS ( Known )
    VAR Average_X =
        AVERAGEX ( Known, Known[X] )
    VAR Average_X2 =
        AVERAGEX ( Known, Known[X] ^ 2 )
    VAR Average_Y =
        AVERAGEX ( Known, Known[Y] )
    VAR Average_Y2 =
        AVERAGEX ( Known, Known[Y] ^ 2 )
    VAR Average_XY =
        AVERAGEX ( Known, Known[X] * Known[Y] )
    VAR CorrelationCoefficient =
        DIVIDE (
            Average_XY
                - Average_X * Average_Y,
            SQRT ( ( Average_X2 - Average_X ^ 2 ) * ( Average_Y2 - Average_Y ^ 2 ) )
        )
    RETURN
        CorrelationCoefficient

The test report page in the above pbix looks like this:image.png

 

 

 

Hopefully that helps. 🙂

 

Regards,

Owen


Owen Auger

Did I answer your question? Mark my post as a solution!

Connect on Twitter
Connect on LinkedIn

View solution in original post

Highlighted

@OwenAugerThank you so much, that's amazing.

 

As an extension, from an idea, would it be possible to do this for a large number of comparisions? I get the feeling that might be better done with python and then exploring the resultant table and data from that instead of trying to use a tabular model to achieve that?

Highlighted

You're welcome 🙂

There's nothing to stop you having an arbitrary number of stocks in your source table...and you could use a matrix visual to show the correlation of every combination, or use that Correlation Plot custom visual. Or create your own R visual I guess

 

Perhaps performance might be better if you prepare the data with python (or R?) rather than computing on the fly - though don't have much experience with that myself.


Owen Auger

Did I answer your question? Mark my post as a solution!

Connect on Twitter
Connect on LinkedIn

Helpful resources

Announcements
Community Conference

Power Platform Community Conference

Check out the on demand sessions that are available now!

Community Conference

Microsoft Power Platform Communities

Check out the Winners!

secondImage

Create an end-to-end data and analytics solution

Learn how Power BI works with the latest Azure data and analytics innovations at the digital event with Microsoft CEO Satya Nadella.

Top Solution Authors
Top Kudoed Authors