Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
send2aj
New Member

Custom correlations or visuals

I have an excel sheet. It has 3 sheets with the following (see sample data below):       
I can provide more data in an excel workbook

1. Worksheet #1 for Number or volume of messages over a 60 min interval for last 60 days
2. Worksheet #2 for Avg hourly Processing times over last 60 days
3. Worksheet #3 is Number of errors over a 60 min interval for last 60 days.

 

I am trying to figure out how to create a correlation matrix and answer the following questions:

  1. Do the increased volume of messages makes it more likely to see increase in errors
  2. Avg Processiing time is correlated to volume of messages
  3. Avg Processiing time is correlated to Number of errors
  4. Or any other observation that the community thinks is an interesting pattern.

Really appreciate standard/custom visuals, methodology and guidance from anyone and all of you.

SAMPLE DATA BELOW:

Volume of messages - Sample Data

Name Date 0_1am 1_2am  2_3am 3_4am 4_5am 5_6am 6_7am  7_8am 8_9am 9_10am 10_11am 11_12am 12_1pm 1_2pm 2_3pm 3_4pm 4_5pm 5_6pm 6_7pm 7_8pm 8_9pm 9_10pm 10_11pm 11_12pm
HIE CCD Target 2016-08-25 00:00:00.000 2 11 15 810 9 15 20 40 90 173 175 194 136 129 166 128 66 20 16 8 14 12 13 9
Xpress Admission ADT to HIE CCD 2016-08-25 00:00:00.000 1043 233 273 1983 283 178 542 864 1499 1597 1837 1757 1365 1431 1623 1083 1054 509 309 301 248 266 192 166

 

Total Proc Time for all messages per hour - Sample data

Name Date 0_1am 1_2am  2_3am 3_4am 4_5am 5_6am 6_7am  7_8am 8_9am 9_10am 10_11am 11_12am 12_1pm 1_2pm 2_3pm 3_4pm 4_5pm 5_6pm 6_7pm 7_8pm 8_9pm 9_10pm 10_11pm 11_12pm
HIE CCD Target 2016-08-25 00:00:00.000 37807 7554 13037 100667 9811 5548 17893 29201 62764 76458 86186 80567 55967 60204 70791 42981 37638 15514 9363 11402 39527 12919 7444 20165
Xpress Admission ADT to HIE CCD 2016-08-25 00:00:00.000 15243 4903 2964 26735 4213 2108 7308 10726 23243 27913 35353 27860 23458 24011 24280 15225 14003 7420 4414 4337 77708 5227 4161 3780

 

Number of errors- Sample Data

Name Date 0_1am 1_2am  2_3am 3_4am 4_5am 5_6am 6_7am  7_8am 8_9am 9_10am 10_11am 11_12am 12_1pm 1_2pm 2_3pm 3_4pm 4_5pm 5_6pm 6_7pm 7_8pm 8_9pm 9_10pm 10_11pm 11_12pm
HIE CCD Target 2016-08-25 00:00:00.000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Xpress Admission ADT to HIE CCD 2016-08-25 00:00:00.000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

1 ACCEPTED SOLUTION
Anonymous
Not applicable

I'm no statistician but, following the sample, I think you're right:

  • Unpivot the values for each hour (i.e. highlight Name and Date, then Unpivot Other Columns),
  • Rename the values columns to reflect the variable - e.g. "Number of errors", "Total proc time", "Volume of messages"; then
  • Merge the three tables on Name and Date and Time Band with your three variables then on each row, unless you want to build table relationships etc..

 

Then add a corrplot visual via R per the sample.  Note that as of yesterday's update, you can alternately add that specific corrplot as a custom "R-powered visual" via https://app.powerbi.com/visuals/  So far though, I've found no way to customise the plot with that custom visual, rather than writing it in R.

 

My quick test seemed to work per below screenshot, though all the zeros for Errors didn't seem to help.  As I say, I'm no statistician - good luck...

 

Corrplot.png

View solution in original post

6 REPLIES 6
Anonymous
Not applicable

Thanks Steve for a Great suggestion . Somehow I feel my data is not primed for this type of analysis. Do I have to unpivot my data an then do the analysis. I wanted to get the format of the raw data correct (pivot vs unpivot). 

 

I am struggling with getting my dataset in right structure for chart & analysis

Anonymous
Not applicable

I'm no statistician but, following the sample, I think you're right:

  • Unpivot the values for each hour (i.e. highlight Name and Date, then Unpivot Other Columns),
  • Rename the values columns to reflect the variable - e.g. "Number of errors", "Total proc time", "Volume of messages"; then
  • Merge the three tables on Name and Date and Time Band with your three variables then on each row, unless you want to build table relationships etc..

 

Then add a corrplot visual via R per the sample.  Note that as of yesterday's update, you can alternately add that specific corrplot as a custom "R-powered visual" via https://app.powerbi.com/visuals/  So far though, I've found no way to customise the plot with that custom visual, rather than writing it in R.

 

My quick test seemed to work per below screenshot, though all the zeros for Errors didn't seem to help.  As I say, I'm no statistician - good luck...

 

Corrplot.png

Hello Steve:

I am trying to merge per your post

Merge the three tables on Name and Date and Time Band with your three variables then on each row, unless you want to build table relationships etc..

 

I unpivot the columns. I used merge query and append query but that is not working. Data is not appending correctly. I wanted to append columns not rows. Any suggestion. Should I just copy and paste.

Anonymous
Not applicable

If you've unpivoted each table, you should have 3 x tables with Name, Date and Attribute as common columns and the variables you are correlating as the last column in each, per sample below for "Number of errors":

Tables.png

 

Your new query to merge those on the common columns should look something like:

let
    Source = Table.NestedJoin(#"Volume of messages",{"Name", "Date", "Attribute"},#"Number of errors",{"Name", "Date", "Attribute"},"NewColumn",JoinKind.Inner),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Volume of messages", type number}}),
    #"Expanded NewColumn" = Table.ExpandTableColumn(#"Changed Type", "NewColumn", {"Number of errors"}, {"Number of errors"}),
    #"Merged Queries" = Table.NestedJoin(#"Expanded NewColumn",{"Name", "Date", "Attribute"},#"Total Proc Time for all messages per hour",{"Name", "Date", "Attribute"},"NewColumn",JoinKind.Inner),
    #"Expanded NewColumn1" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Total proc time"}, {"Total proc time"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded NewColumn1",{{"Number of errors", "Number of errors"}, {"Total proc time", "Total proc time"}, {"Volume of messages", "Volume of messages"}})
in
    #"Renamed Columns"

 

If you click the settings "cog" beside the first Source "Applied Step" at the right, you can see the join settings I used.  I hope that works for you.

Brilliant Steve:

 

Let me try this out and see if it works. The chart is exactly what I am looking for. Truly appreciate your insights.

 

Aj

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.