cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
ehahn Frequent Visitor
Frequent Visitor

AVERAGEX -- EXCLUDING OUTLIERS?

I have a custom measure [% Change] which is calculating the month-over-month percent change of value over time. This measure is included in a matrix to display [% Change] for multiple accounts.

 

I have another custom measure [Average Return] which is calculating the average of [% Change] for each date period as defined by the rows of the matrix. 

 

everything is working, HOWEVER... I now want to EXCLUDE outliers from the [Average Return] calculation. 

 

currently Average Return = AVERAGEX(values(VeoBalHistory[Date]),[% Change])

 

any ideas on how to calculate the above but exclude outliers (for example the -47.62% displayed below)?

2017-10-04_12-30-58.png

1 ACCEPTED SOLUTION

Accepted Solutions
OwenAuger Super Contributor
Super Contributor

Re: AVERAGEX -- EXCLUDING OUTLIERS?

@ehahn

I would recommend a measure like this:

 

Average Return Excluding Outliers =
// Use a method of your choosing to define thresholds for outliers
// This example uses:

// Lower Quartile - 1.5 * Interquartile range &
// Upper Quartile + 1.5 * Interquartile range VAR LowerQuartile = PERCENTILEX.INC ( VALUES ( VeoBalHistory[Date] ), [% Change], .25 ) VAR UpperQuartile = PERCENTILEX.INC ( VALUES ( VeoBalHistory[Date] ), [% Change], .75 ) VAR InterQuartileRange = UpperQuartile - LowerQuartile // Anything < OutlierThresholdLower is an outlier VAR OutlierThresholdLower = LowerQuartile - InterQuartileRange * 1.5 // Anything > than OutlierThresholdUpper is an outlier VAR OutlierThresholdUpper = UpperQuartile + InterQuartileRange * 1.5
// Calculate average over [% Change] only for dates when [% Change] is not an outlier
// For dates where [% Change] is an outlier, the IF function returns BLANK()

// which is not included in the average calculated by AVERAGEX RETURN AVERAGEX ( VALUES ( VeoBalHistory[Date] ), VAR ChangePct = [% Change] RETURN IF ( AND ( ChangePct >= OutlierThresholdLower, ChangePct <= OutlierThresholdUpper ), ChangePct ) )

 



Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!




View solution in original post

11 REPLIES 11
Super User
Super User

Re: AVERAGEX -- EXCLUDING OUTLIERS?

Can you wrap your AVERAGEX in an CALCULATE and filter out outliers that way?


I have book! Learn Power BI from Packt


Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

ehahn Frequent Visitor
Frequent Visitor

Re: AVERAGEX -- EXCLUDING OUTLIERS?

new to DAX, so not sure how to do that.

ehahn Frequent Visitor
Frequent Visitor

Re: AVERAGEX -- EXCLUDING OUTLIERS?

and keep in mind, my goal is to exclude the outlier values from the average calculation.

OwenAuger Super Contributor
Super Contributor

Re: AVERAGEX -- EXCLUDING OUTLIERS?

@ehahn

I would recommend a measure like this:

 

Average Return Excluding Outliers =
// Use a method of your choosing to define thresholds for outliers
// This example uses:

// Lower Quartile - 1.5 * Interquartile range &
// Upper Quartile + 1.5 * Interquartile range VAR LowerQuartile = PERCENTILEX.INC ( VALUES ( VeoBalHistory[Date] ), [% Change], .25 ) VAR UpperQuartile = PERCENTILEX.INC ( VALUES ( VeoBalHistory[Date] ), [% Change], .75 ) VAR InterQuartileRange = UpperQuartile - LowerQuartile // Anything < OutlierThresholdLower is an outlier VAR OutlierThresholdLower = LowerQuartile - InterQuartileRange * 1.5 // Anything > than OutlierThresholdUpper is an outlier VAR OutlierThresholdUpper = UpperQuartile + InterQuartileRange * 1.5
// Calculate average over [% Change] only for dates when [% Change] is not an outlier
// For dates where [% Change] is an outlier, the IF function returns BLANK()

// which is not included in the average calculated by AVERAGEX RETURN AVERAGEX ( VALUES ( VeoBalHistory[Date] ), VAR ChangePct = [% Change] RETURN IF ( AND ( ChangePct >= OutlierThresholdLower, ChangePct <= OutlierThresholdUpper ), ChangePct ) )

 



Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!




View solution in original post

ZaBaLaaa Frequent Visitor
Frequent Visitor

Re: AVERAGEX -- EXCLUDING OUTLIERS?

Great solution . 

I don't understand why there isn't a trimmean() in Excel, this function would be very usefull . 

powerBIpeon Regular Visitor
Regular Visitor

Re: AVERAGEX -- EXCLUDING OUTLIERS?

@OwenAuger

 

Would I be able to use the same measure if I wanted to just trim out the top and bottom 20% of values? 

OwenAuger Super Contributor
Super Contributor

Re: AVERAGEX -- EXCLUDING OUTLIERS?

@powerBIpeon

Yes, the structure of the measure would be similar. You would just redefine OutlierThresholdLower and OutlierThresholdUpper to be the 20th and 80th percentiles instead, with no need for the interquartile range.

 

Please post back if you need more assistance with writing the measure in your particular model.

 

Regards,

Owen



Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!




powerBIpeon Regular Visitor
Regular Visitor

Re: AVERAGEX -- EXCLUDING OUTLIERS?

@OwenAuger

 

I'm using this measure below. Is this correct? I'm getting a slighlty different number when using the TRIMMEAN function in excel

 

Average Return Excluding Outliers = 

VAR LowerQuartile =
    PERCENTILEX.INC ( VALUES ( Account[Acct. Name] ), [Difference], .20 )
VAR UpperQuartile =
    PERCENTILEX.INC ( VALUES ( Account[Acct. Name] ),[Difference], .80 )

// Anything < OutlierThresholdLower is an outlier
VAR OutlierThresholdLower = LowerQuartile
 
// Anything > than OutlierThresholdUpper is an outlier
VAR OutlierThresholdUpper = UpperQuartile
    
// Calculate average over [% Change] only for dates when [% Change] is not an outlier
// For dates where [% Change] is an outlier, the IF function returns BLANK()
// which is not included in the average calculated by AVERAGEX
RETURN
    AVERAGEX( 
        VALUES ( Account[Acct. Name] ),
        VAR ChangePct = [Difference]
        RETURN
            IF (
                AND ( ChangePct >= OutlierThresholdLower, ChangePct <= OutlierThresholdUpper ),
                ChangePct
            )
    )
    
OwenAuger Super Contributor
Super Contributor

Re: AVERAGEX -- EXCLUDING OUTLIERS?

@powerBIpeon

To exactly replicate the behaviour of TRIMMEAN requires a slightly more careful measure.

 

TRIMMEAN in Excel trims an equal number of top & bottom items, and breaks ties arbitrarily, so that it is possible for items with the same value to be partially trimmed (which my earlier measure didn't do). Also, the percentage provided is multiplied by the number of items, then rounded down to the nearest even number.

 

Here is a measure that should replicate TRIMMEAN's behaviour. In this example, TrimPercent is set to 0.4, so 40% of items are trimmed, i.e. 20% at the top + 20% at the bottom.

I won't vouch for performance - this can probably be optimised! 🙂

Average Return Excluding Outliers v2 = 
VAR TrimPercent = 0.4 -- Same as the second argument of TRIMMEAN
VAR Items =
    VALUES ( Account[Acct. Name] )
VAR ItemCount =
    COUNTROWS ( Items )
VAR TrimCount =
    FLOOR ( TrimPercent * ItemCount, 2 ) / 2 -- Count of items to be trimmed at top (& bottom)
// ItemsToTrim is the union of the top & bottom items.
// If two items have the same [Difference value], then ties are broken
// using the ordering of Acct. Name VAR ItemsToTrim = UNION ( TOPN ( TrimCount, Items, RANKX ( Items, [Difference] + DIVIDE ( RANKX ( Items, Account[Acct. Name] ), ItemCount + 1 ) ), ASC ), TOPN ( TrimCount, Items, RANKX ( Items, [Difference] + DIVIDE ( RANKX ( Items, Account[Acct. Name] ), ItemCount + 1 ) ), DESC ) ) VAR ItemsToKeep = EXCEPT ( Items, ItemsToTrim ) RETURN AVERAGEX ( ItemsToKeep, [Difference] )

Regards,

Owen



Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!




Helpful resources

Announcements
Ask Amir Anything

Exclusive LIVE Community Event No. 2 – Ask Amir Anything

Next in our Triple A series: Ask Amir Netz questions about the latest updates, features and future.

Ask Amanda Anything Q&A

Ask Amanda Anything Q&A

Learn the answers to some of the questions asked during the Amanda Triple A event.

October 2019 Community Highlights

October 2019 Community Highlights

October was a busy month in the community. Read the recap article to learn about some of the events and content.

New Solution Badges

New Solution Badges

Two waves of brand new solution badges are coming! Read the article for more information on our new community badges.

Users Online
Currently online: 323 members 3,366 guests
Please welcome our newest community members: