Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
epicleo
Frequent Visitor

Including 0 values when calculating STDEV

Hello Everyone, 

 

I am attempting to calculate the 26 Week STDEV for weekly sales.  The Sales records are daily so I am using a Date table to aggregate them by week.

 

The problem I am coming into is when one of the weeks total sales for a specific product is 0, then the STDEV formula below ignored that period and the 0 for that period, which skews the standard deviation.  

How can I get the STDEV to conisder the periods where sales are zero?


Thanks you!!

 

 

 

 

26-W.STDEV = CALCULATE(
STDEVX.P(
VALUES(DimDate[Calendar Week Number]),
CALCULATE(SUM(iContractsChargebacks[ChargebackUnits]))
),
FILTER(ALL(DimDate),
DimDate[Week ID]<=MAX(DimDate[Week ID])-1 &&
DimDate[Week ID]>=MAX(DimDate[Week ID])-26
))

 

10 REPLIES 10
Bhanu_VA
Helper II
Helper II

Hi @epicleo ,

Did you find any solution for it?

Facing exactly the same issue.

 

Thanks

v-huizhn-msft
Employee
Employee

Hi @epicleo,

As the @GilbertQ said, if all they are null where sales are zero, you can change it by calculating a calculated column using the formula.

New Column=IF(ISBALNK(Table[column]),0,Table[column])

Then use the new column in your 26-W.STDEV formula, and check if it works fine.

Best Regards,
Angelia


Sorry for the delay.  There is actually no value.  Let me walk you through a better example.  I am attempting to determine the most recent 6-Month STDEV for sales by product in order to determine variation.  

 

The Sales Datais made up of daily transactions so it contains a   iContractsChargebacks[TransactionDate] and iContractsChargebacks[ChargebackUnits] (i.e. sales), but if there are no sales in a given month, then there will be no data for that month. 

 

So, for example, on July 1st, sales for the past 6 months were the following:

 

Jan 100

Feb 125
Apr 140
May 125

Jun 130

 

March is missing because there were no sales.  So, when I calculate STDEV on the data set, it is calculating it over 5 periods, when in fact there were 6, just one happend to be zero.

 

The code I am curretly using is:

 

6M STDEV = CALCULATE(
STDEVX.P(
VALUES(DimDate[Calendar Month Number]),
CALCULATE(SUM(iContractsChargebacks[ChargebackUnits]))
),
FILTER(ALL(DimDate),
DimDate[Month ID]<=MAX(DimDate[Month ID])-1 &&
DimDate[Month ID]>=MAX(DimDate[Month ID])-6
))

 

Instead of using date parameters in the code, I created a calculated column in the date table that gives each Month a unique ID, makes it easier for me.

 

What I need the formula to do is calculate STDEV across the six month period.  If when pulling the monthly sales numbers, it only comes back with 3 periods, then it needs to assume the other 2 periods with a zero value.

 

Thank you for your help with this!!

Hi @epicleo


What I would suggest doing is rather than to try and figure out when there is no data I would solve it by doing the following.

 

I would left join from your DimDate table, to your Fact (Source Data) table. By doing a left join from the DimDate table it will then bring in all the dates. This will then allow your data to be blank when there is no data.

 

Then based on your requirements you can then use the same measures and not it should show zero for the months where there is no data when you put in your Date column from your DimDate table.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

@GilbertQ Thank you for the help; that would be a great work around except that I m dealing with more than 1000 SKUs so my query would produce a few million rows given we transact on a daily basis and are reporting on 3 to 5 year period.

 

I thought about manaully calculating standard deviation instead of using the DAX STDEVX.P formula and found these 2 as a reference on how to do so, the first being closest to my need:

 

https://community.powerbi.com/t5/Desktop/Problem-with-STDEV/td-p/19731

 

 

https://stackoverflow.com/questions/42605377/calculating-the-standard-deviation-from-columns-of-valu...

 

So, given my detail above, I attempted to make a go of it, but still am not getting the correct number.  My code is:

 

STDEVX2 =
var Averageprice=[6M Sales]
var months=6
return
SQRT(
DIVIDE(SUMX(
FILTER(ALL(DimDate),
DimDate[Month ID]<=(MAX(DimDate[Month ID])-1) &&
DimDate[Month ID]>=(MAX(DimDate[Month ID])-6)
),
(iContractsChargebacks[SumOfOrderQuantity]-Averageprice)^2),
months
)
)

 

I am determined to figure this out and I appreciate all your help with this.  Thank you in advance!!!

I was facing a similar problem while trying to calculate the coefficient of variation (Std. /Mean) by SKUS from sales data. I could use the Pivot-Unpivot function in Power Query editor to to do away with the problem of months with missing sales:

1) Export the data with any calculated columns

2) Reimport the data so that the calculated columns are also available in the power query editor

3) Pivoted the data by months

4) Replaced null values with 0s

5) Unpivoted the data

6) Close and apply the query

7) Add a calculated column for the coefficient of variation using the formula CV = CALCULATE(STDEV.P(Table1[Value]),ALLEXCEPT(Table1,Table1[Product]))/CALCULATE(AVERAGE(Table1[Value]),ALLEXCEPT(Table1,Table1[Product]))

 

Thus zero sales for the missing months will also be considered both for Standard Deviation and Mean.

Hi, 

 

did you was able to find a solution? I'm facing with a similar probelm (I would calculate standard deviation on latest 52 weeks but I have some week with no data)...

What I would suggest doing is if you can just get a dataset for the months that you are testing, and by using the merge to see it can get to the number you are expecting to be the result?

 

This should then ensure that the theory is correct. Which is important to know that you can calculate the final number.

 

Then if the above works, if you had to bring in all the data, how large would the data model be? As there are always ways to optimize the data model. And I have worked with large datasets with a few million rows and it has still been very fast both when developing as well as once deployed to the Power BI Service.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

GilbertQ
Super User
Super User

Hi @epicleo

 

Are the values zero or are they NULL?

 

By default Power BI will not use any values that are NULL, but if they have any other value (including 0) it should then still consider it?





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

@GilbertQ Sorry for the delay.  There is actually no value.  Let me walk you through a better example.  I am attempting to determine the most recent 6-Month STDEV for sales by product in order to determine variation.  

 

The Sales Data is made up of daily transactions, so it contains a   iContractsChargebacks[TransactionDate] and iContractsChargebacks[ChargebackUnits] (i.e. sales), but if there are no sales in a given month, then there will be no data for that month. 

 

So, for example, on July 1st, sales for the past 6 months were the following:

 

Jan 100

Feb 125
Apr 140
May 125

Jun 130

 

March is missing because there were no sales.  So, when I calculate STDEV on the data set, it is calculating it over 5 periods, when in fact there were 6, just one happend to be zero.

 

The code I am curretly using is:

 

6M STDEV = CALCULATE(
STDEVX.P(
VALUES(DimDate[Calendar Month Number]),
CALCULATE(SUM(iContractsChargebacks[ChargebackUnits]))
),
FILTER(ALL(DimDate),
DimDate[Month ID]<=MAX(DimDate[Month ID])-1 &&
DimDate[Month ID]>=MAX(DimDate[Month ID])-6
))

 

Instead of using date parameters in the code, I created a calculated column in the date table that gives each Month a unique ID, makes it easier for me.

 

What I need the formula to do is calculate STDEV across the six month period.  If when pulling the monthly sales numbers, it only comes back with 3 periods, then it needs to assume the other 2 periods with a zero value.

 

Thank you for your help with this!!

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.