Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
Revolution
Frequent Visitor

Calculate TRIMMEAN of last 12 months from raw sales data

I have a table containing quantity of each stock item sold per invoice. This includes date and I have a related Date table. I need to sum the quantity sold for each stock item monthly for the last 12 months then calculate the TRIMMEAN of this monthly data where high and low months are excluded.

 

Do I need to create a new table for the rolling 12 months where each column is a month then create a measure for the TRIMMEAN? If so, how do I create this table?

 

The issue is that I want to exclude outlying months and I can't find an example that sums the sales by date and then calculates the trimmed mean. I am also very new to this so may be misunderstanding what I've seen. 

 

If there is a better way please let me know. Thanks. 

1 ACCEPTED SOLUTION
Sahir_Maharaj
Super User
Super User

Hello @Revolution,

 

Can you please try the following:

 

1. Create a Measure for Monthly Sales

Monthly Sales = 
SUMX(
    VALUES('Date'[YearMonth]),
    CALCULATE(SUM('Sales'[Quantity]))
)

2. Create a Rolling 12-Month Sales Measure

Rolling 12-Month Sales = 
CALCULATE(
    [Monthly Sales],
    DATESBETWEEN(
        'Date'[Date],
        NEXTDAY(SAMEPERIODLASTYEAR(LASTDATE('Date'[Date]))),
        LASTDATE('Date'[Date])
    )
)

3. Rank Monthly Sales

Sales Rank = 
RANKX(
    ALL('Date'[YearMonth]), 
    [Monthly Sales], 
    , 
    ASC, 
    Dense
)

4. Exclude Outliers and Calculate the Mean

Trimmed Mean Sales = 
AVERAGEX(
    FILTER(
        ADDCOLUMNS(
            VALUES('Date'[YearMonth]),
            "Sales", [Monthly Sales],
            "Rank", [Sales Rank]
        ),
        [Rank] > 1 && [Rank] < 12
    ),
    [Sales]
)

Did I answer your question? Mark my post as a solution, this will help others!

If my response(s) assisted you in any way, don't forget to drop me a "Kudos" 🙂

Kind Regards,
Sahir Maharaj
Data Scientist | Data Engineer | Data Analyst | AI Engineer
P.S. Want me to build your Power BI solution?
➤ Lets connect on LinkedIn: Join my network of 13K+ professionals
➤ Join my free newsletter: Data Driven: From 0 to 100
➤ Website: https://sahirmaharaj.com
➤ Email: sahir@sahirmaharaj.com
➤ Want me to build your Power BI solution? Lets chat about how I can assist!
➤ Join my Medium community of 30K readers! Sharing my knowledge about data science and artificial intelligence
➤ Explore my latest project (350K+ views): Wordlit.net
➤ 100+ FREE Power BI Themes: Download Now
LinkedIn Top Voice in Artificial Intelligence, Data Science and Machine Learning

View solution in original post

5 REPLIES 5
StrategicSavvy
Resolver II
Resolver II

Hi @Revolution  ,

 

It looks to me as though as code 2132 has only one point of data within dataset which would make sense in this case. Can you assert this ? You can quickly bring explicit measures and instead of Sum Quantity sold aggregarte count or distinct count. 

it is exemplified in this row as in the filter context there is only one value (100) to evaluate:

StrategicSavvy_1-1710016266024.png

 

 I hope this post helps to solve your issue and if it does consider accepting it as a solution and giving the post a thumbs up!

My LinkedIn: www.linkedin.com/in/lukasz-kozdron 

Sahir_Maharaj
Super User
Super User

Hello @Revolution,

 

Can you please try the following:

 

1. Create a Measure for Monthly Sales

Monthly Sales = 
SUMX(
    VALUES('Date'[YearMonth]),
    CALCULATE(SUM('Sales'[Quantity]))
)

2. Create a Rolling 12-Month Sales Measure

Rolling 12-Month Sales = 
CALCULATE(
    [Monthly Sales],
    DATESBETWEEN(
        'Date'[Date],
        NEXTDAY(SAMEPERIODLASTYEAR(LASTDATE('Date'[Date]))),
        LASTDATE('Date'[Date])
    )
)

3. Rank Monthly Sales

Sales Rank = 
RANKX(
    ALL('Date'[YearMonth]), 
    [Monthly Sales], 
    , 
    ASC, 
    Dense
)

4. Exclude Outliers and Calculate the Mean

Trimmed Mean Sales = 
AVERAGEX(
    FILTER(
        ADDCOLUMNS(
            VALUES('Date'[YearMonth]),
            "Sales", [Monthly Sales],
            "Rank", [Sales Rank]
        ),
        [Rank] > 1 && [Rank] < 12
    ),
    [Sales]
)

Did I answer your question? Mark my post as a solution, this will help others!

If my response(s) assisted you in any way, don't forget to drop me a "Kudos" 🙂

Kind Regards,
Sahir Maharaj
Data Scientist | Data Engineer | Data Analyst | AI Engineer
P.S. Want me to build your Power BI solution?
➤ Lets connect on LinkedIn: Join my network of 13K+ professionals
➤ Join my free newsletter: Data Driven: From 0 to 100
➤ Website: https://sahirmaharaj.com
➤ Email: sahir@sahirmaharaj.com
➤ Want me to build your Power BI solution? Lets chat about how I can assist!
➤ Join my Medium community of 30K readers! Sharing my knowledge about data science and artificial intelligence
➤ Explore my latest project (350K+ views): Wordlit.net
➤ 100+ FREE Power BI Themes: Download Now
LinkedIn Top Voice in Artificial Intelligence, Data Science and Machine Learning
StrategicSavvy
Resolver II
Resolver II

Hi Revolution, 

It is really interesting problem and I like the idea of measuring such a metric. It give quite good insight statitical measure.  I think you can solve it with measures , so basically first you need to create your measures for Trim Mean and Trim Mean Rolling 12m.

Here is example code for that:

 

 

 

TrimMean =
VAR TrimPercent = 0.1 -- Replace with the desired percentage (between 0 and 1)
VAR Counts =
    SELECTCOLUMNS (
        VALUES ( 'Table'[Quantity Sold] ),
        "Data Point", 'Table'[Quantity Sold],
        "Count", CALCULATE ( COUNTROWS ( 'Table' ) )
    )
VAR NumberOfDataPoints = SUMX ( Counts, [Count] )
VAR StartAt = INT ( NumberOfDataPoints * TrimPercent / 2 )
VAR FinishAt = NumberOfDataPoints - StartAt
VAR RunningCounts =
    ADDCOLUMNS (
        Counts,
        "RunningCount",
        VAR ThisDataPoint = [Data Point]
        RETURN SUMX ( FILTER ( Counts, [Data Point] <= ThisDataPoint ), [Count] )
    )
VAR TrimmedCounts =
    ADDCOLUMNS (
        RunningCounts,
        "Trimmed Count",
        VAR ThisDataPoint = [Data Point]
        VAR MinRunningCount = MINX ( FILTER ( RunningCounts, [RunningCount] >= StartAt ), [RunningCount] )
        VAR MaxRunningCount = MAXX ( FILTER ( RunningCounts, [RunningCount] <= FinishAt ), [RunningCount] )
        VAR TrimmedTop = MAX ( [RunningCount] - StartAt, 0 )
        VAR TrimmedBottom = MAX ( [Count] - MAX ( [RunningCount] - FinishAt, 0 ), 0 )
        RETURN SWITCH (
            TRUE,
            [RunningCount] <= MinRunningCount, TrimmedTop,
            [RunningCount] > MaxRunningCount, TrimmedBottom,
            [Count]
        )
    )
VAR Numerator = SUMX ( TrimmedCounts, [Data Point] * [Trimmed Count] )
VAR Denominator = SUMX ( TrimmedCounts, [Trimmed Count] )
VAR TrimmedMean = DIVIDE ( Numerator, Denominator )
RETURN 
IF ( OR ( TrimPercent < 0, TrimPercent >= 1 ), BLANK(), TrimmedMean )


Trim Mean Rolling 12m = 
VAR CurrentDate = MAX('Calendar'[Date])
VAR NumberOfPeriods = -12
RETURN
CALCULATE(
    [TrimMean],
    DATESINPERIOD('Calendar'[Date],CurrentDate,NumberOfPeriods,MONTH)
)

 

 

 
Here you can compare both formulas in PowerBI and excel
StrategicSavvy_0-1710003569508.png

 

StrategicSavvy_1-1710003585487.png

I hope this post helps to solve your issue and if it does consider accepting it as a solution and giving the post a thumbs up!

My LinkedIn: www.linkedin.com/in/lukasz-kozdron 

Hi @StrategicSavvy thanks again for your help. Just to tie up loose ends I thought I'd respond again as I wasn't able to get your solution to produce my expected results.

 

We spoke about code 2132 and you were correct in saying it only had one data point. I was, however, attempting to calculate the trimmed mean of 12 months. So 12 data points are necessary for each code as no sales in any particular month means we most likely didn't have stock and these are exactly what I'm trying to trim out of the mean. In fact I'd like to take it further and only exclude data points that fall outside of a threshold determined by the trimmed mean. 

 

I was able to get @Sahir_Maharaj answer working so I didn't persue your solution any further. 

Hi @StrategicSavvy thank you for your response. I implemented your solution but the values returned for Trim Mean Rolling 12m were not correct. I can't work out exactly how the calculation has gone wrong. I suspect it is due to the source data table which is very raw data. There is a row for the quantity sold of each stock item for each invoice. So there are multiple rows for each stock item for any given date. 

 

See below example for stock code 2132 where total sales is equal to the Trim Mean Rolling 12m which, of course, can not be correct.

 

Revolution_0-1710010136152.jpeg

 

Please let me know what else I can provide to assist you in understanding my data set.

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel1

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.

Top Kudoed Authors