cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
Frequent Visitor

Including 0 values when calculating STDEV

Hello Everyone, 

 

I am attempting to calculate the 26 Week STDEV for weekly sales.  The Sales records are daily so I am using a Date table to aggregate them by week.

 

The problem I am coming into is when one of the weeks total sales for a specific product is 0, then the STDEV formula below ignored that period and the 0 for that period, which skews the standard deviation.  

How can I get the STDEV to conisder the periods where sales are zero?


Thanks you!!

 

 

 

 

26-W.STDEV = CALCULATE(
STDEVX.P(
VALUES(DimDate[Calendar Week Number]),
CALCULATE(SUM(iContractsChargebacks[ChargebackUnits]))
),
FILTER(ALL(DimDate),
DimDate[Week ID]<=MAX(DimDate[Week ID])-1 &&
DimDate[Week ID]>=MAX(DimDate[Week ID])-26
))

 

9 REPLIES 9
Highlighted
Super User IV
Super User IV

Re: Including 0 values when calculating STDEV

Hi @epicleo

 

Are the values zero or are they NULL?

 

By default Power BI will not use any values that are NULL, but if they have any other value (including 0) it should then still consider it?





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

Highlighted
Microsoft
Microsoft

Re: Including 0 values when calculating STDEV

Hi @epicleo,

As the @GilbertQ said, if all they are null where sales are zero, you can change it by calculating a calculated column using the formula.

New Column=IF(ISBALNK(Table[column]),0,Table[column])

Then use the new column in your 26-W.STDEV formula, and check if it works fine.

Best Regards,
Angelia


Highlighted
Frequent Visitor

Re: Including 0 values when calculating STDEV

@GilbertQ Sorry for the delay.  There is actually no value.  Let me walk you through a better example.  I am attempting to determine the most recent 6-Month STDEV for sales by product in order to determine variation.  

 

The Sales Data is made up of daily transactions, so it contains a   iContractsChargebacks[TransactionDate] and iContractsChargebacks[ChargebackUnits] (i.e. sales), but if there are no sales in a given month, then there will be no data for that month. 

 

So, for example, on July 1st, sales for the past 6 months were the following:

 

Jan 100

Feb 125
Apr 140
May 125

Jun 130

 

March is missing because there were no sales.  So, when I calculate STDEV on the data set, it is calculating it over 5 periods, when in fact there were 6, just one happend to be zero.

 

The code I am curretly using is:

 

6M STDEV = CALCULATE(
STDEVX.P(
VALUES(DimDate[Calendar Month Number]),
CALCULATE(SUM(iContractsChargebacks[ChargebackUnits]))
),
FILTER(ALL(DimDate),
DimDate[Month ID]<=MAX(DimDate[Month ID])-1 &&
DimDate[Month ID]>=MAX(DimDate[Month ID])-6
))

 

Instead of using date parameters in the code, I created a calculated column in the date table that gives each Month a unique ID, makes it easier for me.

 

What I need the formula to do is calculate STDEV across the six month period.  If when pulling the monthly sales numbers, it only comes back with 3 periods, then it needs to assume the other 2 periods with a zero value.

 

Thank you for your help with this!!

 

Highlighted
Frequent Visitor

Re: Including 0 values when calculating STDEV

Sorry for the delay.  There is actually no value.  Let me walk you through a better example.  I am attempting to determine the most recent 6-Month STDEV for sales by product in order to determine variation.  

 

The Sales Datais made up of daily transactions so it contains a   iContractsChargebacks[TransactionDate] and iContractsChargebacks[ChargebackUnits] (i.e. sales), but if there are no sales in a given month, then there will be no data for that month. 

 

So, for example, on July 1st, sales for the past 6 months were the following:

 

Jan 100

Feb 125
Apr 140
May 125

Jun 130

 

March is missing because there were no sales.  So, when I calculate STDEV on the data set, it is calculating it over 5 periods, when in fact there were 6, just one happend to be zero.

 

The code I am curretly using is:

 

6M STDEV = CALCULATE(
STDEVX.P(
VALUES(DimDate[Calendar Month Number]),
CALCULATE(SUM(iContractsChargebacks[ChargebackUnits]))
),
FILTER(ALL(DimDate),
DimDate[Month ID]<=MAX(DimDate[Month ID])-1 &&
DimDate[Month ID]>=MAX(DimDate[Month ID])-6
))

 

Instead of using date parameters in the code, I created a calculated column in the date table that gives each Month a unique ID, makes it easier for me.

 

What I need the formula to do is calculate STDEV across the six month period.  If when pulling the monthly sales numbers, it only comes back with 3 periods, then it needs to assume the other 2 periods with a zero value.

 

Thank you for your help with this!!

Highlighted
Super User IV
Super User IV

Re: Including 0 values when calculating STDEV

Hi @epicleo


What I would suggest doing is rather than to try and figure out when there is no data I would solve it by doing the following.

 

I would left join from your DimDate table, to your Fact (Source Data) table. By doing a left join from the DimDate table it will then bring in all the dates. This will then allow your data to be blank when there is no data.

 

Then based on your requirements you can then use the same measures and not it should show zero for the months where there is no data when you put in your Date column from your DimDate table.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

Highlighted
Frequent Visitor

Re: Including 0 values when calculating STDEV

@GilbertQ Thank you for the help; that would be a great work around except that I m dealing with more than 1000 SKUs so my query would produce a few million rows given we transact on a daily basis and are reporting on 3 to 5 year period.

 

I thought about manaully calculating standard deviation instead of using the DAX STDEVX.P formula and found these 2 as a reference on how to do so, the first being closest to my need:

 

https://community.powerbi.com/t5/Desktop/Problem-with-STDEV/td-p/19731

 

 

https://stackoverflow.com/questions/42605377/calculating-the-standard-deviation-from-columns-of-valu...

 

So, given my detail above, I attempted to make a go of it, but still am not getting the correct number.  My code is:

 

STDEVX2 =
var Averageprice=[6M Sales]
var months=6
return
SQRT(
DIVIDE(SUMX(
FILTER(ALL(DimDate),
DimDate[Month ID]<=(MAX(DimDate[Month ID])-1) &&
DimDate[Month ID]>=(MAX(DimDate[Month ID])-6)
),
(iContractsChargebacks[SumOfOrderQuantity]-Averageprice)^2),
months
)
)

 

I am determined to figure this out and I appreciate all your help with this.  Thank you in advance!!!

Highlighted
Super User IV
Super User IV

Re: Including 0 values when calculating STDEV

What I would suggest doing is if you can just get a dataset for the months that you are testing, and by using the merge to see it can get to the number you are expecting to be the result?

 

This should then ensure that the theory is correct. Which is important to know that you can calculate the final number.

 

Then if the above works, if you had to bring in all the data, how large would the data model be? As there are always ways to optimize the data model. And I have worked with large datasets with a few million rows and it has still been very fast both when developing as well as once deployed to the Power BI Service.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

Highlighted
Frequent Visitor

Re: Including 0 values when calculating STDEV

Hi, 

 

did you was able to find a solution? I'm facing with a similar probelm (I would calculate standard deviation on latest 52 weeks but I have some week with no data)...

Highlighted
New Member

Re: Including 0 values when calculating STDEV

I was facing a similar problem while trying to calculate the coefficient of variation (Std. /Mean) by SKUS from sales data. I could use the Pivot-Unpivot function in Power Query editor to to do away with the problem of months with missing sales:

1) Export the data with any calculated columns

2) Reimport the data so that the calculated columns are also available in the power query editor

3) Pivoted the data by months

4) Replaced null values with 0s

5) Unpivoted the data

6) Close and apply the query

7) Add a calculated column for the coefficient of variation using the formula CV = CALCULATE(STDEV.P(Table1[Value]),ALLEXCEPT(Table1,Table1[Product]))/CALCULATE(AVERAGE(Table1[Value]),ALLEXCEPT(Table1,Table1[Product]))

 

Thus zero sales for the missing months will also be considered both for Standard Deviation and Mean.

Helpful resources

Announcements
August 2020 Community Challenge: Can You Solve These?

August 2020 Community Challenge: Can You Solve These?

We're excited to announce our first cross-community 'Can You Solve These?' challenge!

Community Blog

Community Blog

Visit our Community Blog for articles, guides, and information created by fellow community members.

Upcoming Events

Upcoming Events

Wondering what events you could join or have an event to promote yourself? Check out our Upcoming Events.

Get Ready for Power BI Dev Camp

Get Ready for Power BI Dev Camp

We are thrilled to announce we will begin running a monthly webinar series named Power BI Dev Camp.

Top Solution Authors
Top Kudoed Authors