Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Greg_Deckler
Super User
Super User

Serious Issue with New Correlation Coefficient Quick Measure

OK, in my opinion, there is a serious deficiency in the new Correlation Coefficient Quick Measure. The issue is that it does not factor in the possibility that there are more rows in, say Y than in X in the source data. The original technique I posted here:

 

https://community.powerbi.com/t5/Community-Blog/Correlation-Seasonality-and-Forecasting-with-Power-B...

 

and @Daniil's version of that here:

https://community.powerbi.com/t5/Quick-Measures-Gallery/Correlation-coefficient/m-p/196274

 

does not exhibit these issues because they both either use measures that honor relationships or create a table that honors those relationships. In this way, unrelated rows are automatically weeded out and not factored into the correlation.

 

This can be easily seen by using the data from my original article on the subject. There are more wage rows than forecast rows. So, there is an unmatched Wages row. The SUM of Y given by the new correlation coefficient gives 242 for this data when it should, in fact, be 222. Big issue because this causes the correlation calculation to fail.

 

Overall, seems like a sloppy implementation. I would refactor it to do exactly what @Daniil did and create a table that contains the category and the the two measures that you want so that this drops out any unrelated rows.

 

 

 

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...
3 REPLIES 3
RT
Frequent Visitor

So is it safe to say that the quick measure correlation coefficient can't handle null/blank values but can handle using measures, and Daniil's version can handle nulls/blanks, but requires a table[column] and can't handle a measure?  Certainly the DAX is different between the two.  I have survey data that has blanks which I need to handle, but was also hoping to be able to use measures so that I could create some disconnected slicers tables to allow people to choose two survey metrics to then see if their correlation coefficient. 

I guess I would need clarification on what you mean by handling or not handling slicers. In their raw form, I'm not sure that either will necessarily handle measures but I'd need to better understand what you mean by that. I would think that the original formula would be able to handle measures since it is essentially creating a temp table. But, again, that depends on how you are using measures. If you need to handle null values, the stock Quick Measure as coded will not work, you will get incorrect results.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...
Greg_Deckler
Super User
Super User

I would suggest perhaps this:

 

{X MEASURE} and {Y MEASURE} correlation for {CATEGORY} = 
VAR __CORRELATION_TABLE = FILTER(ADDCOLUMNS(VALUES({CATEGORY}),"__X",CALCULATE({X MEASURE})),"__Y",CALCULATE({Y MEASURE}))), AND (
            NOT ( ISBLANK ( [__X] ) ),
            NOT ( ISBLANK ( [__Y] ) )
        ))
VAR __COUNT =
	COUNTX(
		KEEPFILTERS(__CORRELATION_TABLE),
		[__X]
	)
VAR __SUM_X =
	SUMX(
		KEEPFILTERS(__CORRELATION_TABLE),
		[__X]
	)
VAR __SUM_Y = SUMX(KEEPFILTERS(__CORRELATION_TABLE), [__Y])
VAR __SUM_XY =
	SUMX(
		KEEPFILTERS(__CORRELATION_TABLE),
		[__X] * [__Y] * 1.
	)
VAR __SUM_X2 =
	SUMX(
		KEEPFILTERS(__CORRELATION_TABLE),
		[__X] ^ 2
	)
VAR __SUM_Y2 =
	SUMX(
		KEEPFILTERS(__CORRELATION_TABLE),
		[__Y] ^ 2
	)
RETURN
	DIVIDE(
		__COUNT * __SUM_XY - __SUM_X * __SUM_Y * 1.,
		SQRT(
			(__COUNT * __SUM_X2 - __SUM_X ^ 2)
				* (__COUNT * __SUM_Y2 - __SUM_Y ^ 2)
		)
	)

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.