cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
ripstaur
Helper III
Helper III

Calculating daily differences

I have a data set that contains columns for counties and states in the US, and COVID-19 case counts and deaths in each county each day. The numbers are cumulative...if there was one case on day one, and no new cases on day two, there will be a 1 for day one and a 1 for day two. If on day three there is an additional case, then the number will change to 2. I am trying to find an expression in either M or Dax (if I can't do it in Power Query) that will let me calculate the daily change in case numbers. So, for instance in the above example, I would have had a 1 day 1, a 0 for day two, and a 1 for day three. This would give me the total of new cases reported each day. 

I realize that some sort of moving sum taking the difference between yesterday and today would work. One problem, though, is that I need the expression to let me also group by county as well as date, so I get the number of new cases in each county for each day. I would also have to put in some sort of "if the result is less than 0, then return 0" feature (because sometimes the counties mess up and report fewer cases instead of the cumulative number)...probably better to set it to return a truncated 7-day average, so I don't end up with spurious zeros in my time series. I have tried using the "Group by" option in Power Query, but it doesn't have a "moving" calculation. Anyone have any ideas? 

1 ACCEPTED SOLUTION

Hi @ripstaur ,

First create an index column;

Then create a column as below:

Daily Cases =
VAR _maxvalue =
    CALCULATE (
        MAX ( 'Table'[Cases] ),
        FILTER (
            'Table',
            'Table'[County] = EARLIER ( 'Table'[County] )
                && 'Table'[Index] < EARLIER ( 'Table'[Index] )
        )
    )
RETURN
    'Table'[Cases] - _maxvalue

And you will see:

vkellymsft_0-1629091502084.png

For the related .pbix file,pls see attached.

 

Best Regards,
Kelly

Did I answer your question? Mark my reply as a solution!

View solution in original post

10 REPLIES 10
Jos_Woolley
Solution Supplier
Solution Supplier

Power Query only alternative:

let
  Source = YourSource, 
  #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type), 
  #"Added Custom" = Table.AddColumn(
    #"Added Index", 
    "Custom", 
    each 
      let
        Previous = try #"Added Index"[Cases]{[Index] - 1} otherwise 0, 
        Result   = if Previous > [Cases] then 0 else [Cases] - Previous
      in
        Result
  )
in
  #"Added Custom"

which adds a 0-based Index column and then a Custom column which gives your required results. Amend the line Source = YourSource accordingly to fit in with your previous steps.

Regards

Thanks, Jos,

 

I tried this, copying your code into the Advanced Editor, and I ended up with this:

 

ripstaur_0-1629120823464.png

 

If I expand all the columns in the "Data" column, I get the following:

 

ripstaur_1-1629120935282.png

 

Not sure where I'm going wrong....

 

 

Oh...I didn't get it all. Beyond Data.Column6 I have: 

ripstaur_2-1629121190873.png

It's probably worth noting that the index column contains nothing but zeros, and the Custom column is all Errors. 

 

Another note: My original query included steps to promote the first column as headers, the "change type" commands that always follow promoting headers, and a command to remove column 5 (a blank column in my source). When I added your code after those lines, I got exactly the same result as I did when I just wrote your code in after the source lines. 

"Another note: My original query included steps to promote the first column as headers, the "change type" commands that always follow promoting headers, and a command to remove column 5 (a blank column in my source). When I added your code after those lines, I got exactly the same result as I did when I just wrote your code in after the source lines."

Yes, you should keep your original steps in. And when you added my code in, did you make sure that the (previous) step being referenced in my first line was the last of your initial steps (and not the first)?

Regards

ripstaur
Helper III
Helper III

Thank you , Edhans. This problem is like calculating moving ranges, but I need to be able to do it per county across the date range. I put together a sample (with an extra column showing the desired result of the calculation. When shifting from one county to another, the date is going to drop back to the starting date for the time period. In the result cell for that record, the value should equal the "cases" value for that day.

StateCountyDateCases Desired Result (Daily Cases)
AlabamaAbbot1/1/20201 1
AlabamaAbbot1/2/20201 0
AlabamaAbbot1/3/20202 1
AlabamaAbbot1/4/20202 0
AlabamaAbbot1/5/20203 1
AlabamaAbbot1/6/20203 0
AlabamaAbbot1/7/20203 0
AlabamaAbbot1/8/20204 1
AlabamaAbbot1/9/20204 0
AlabamaAbbot1/10/20205 1
AlabamaBillings1/1/20200 0
AlabamaBillings1/2/20200 0
AlabamaBillings1/3/20202 2
AlabamaBillings1/4/20202 0
AlabamaBillings1/5/20203 1
AlabamaBillings1/6/20204 1
AlabamaBillings1/7/20204 0
AlabamaBillings1/8/20204 0
AlabamaBillings1/9/20205 1
AlabamaBillings1/10/20205 0
DelawareDeForbes1/1/20200 0
DelawareDeForbes1/2/20201 1
DelawareDeForbes1/3/20201 0
DelawareDeForbes1/4/20201 0
DelawareDeForbes1/5/20203 2
DelawareDeForbes1/6/20203 0
DelawareDeForbes1/7/20205 2
DelawareDeForbes1/8/20207 2
DelawareDeForbes1/9/20207 0
DelawareDeForbes1/10/20207 0

Hi @ripstaur ,

First create an index column;

Then create a column as below:

Daily Cases =
VAR _maxvalue =
    CALCULATE (
        MAX ( 'Table'[Cases] ),
        FILTER (
            'Table',
            'Table'[County] = EARLIER ( 'Table'[County] )
                && 'Table'[Index] < EARLIER ( 'Table'[Index] )
        )
    )
RETURN
    'Table'[Cases] - _maxvalue

And you will see:

vkellymsft_0-1629091502084.png

For the related .pbix file,pls see attached.

 

Best Regards,
Kelly

Did I answer your question? Mark my reply as a solution!

Thanks, Kelly! This worked like a charm. I'm still trying to make a PowerQuery solution work, but this DAX solution is absolutely one great solution. My question now becomes - which is a more efficient solution--the DAX modeling or a PowerQuery one, if we can figure it out? I am going to replicate this for the number of deaths per county as well. Since there are about 3400 counties in the US, and the daily report also include extras (e.g., US territory counts, counts of cases that were reported without a county affiliation), I add 3400+ new records every day , with a State column, a county column, a lat and long column, a FIPS column, cumulative cases, cumulative deaths, daily cases and daily deaths. So would it be more efficient to do this in modeling, or querying? 

 

Best regards,

 

Rip

samdthompson
Super User
Super User

Hello, this would be btter tackled with DAX but you can do it in power query. 

 

1. Sort the data by say country, region and date.

2. add index starting at 0

3. duplicate table and alter the index to starting at 1

4. merge tables together with a full outer join using country and index as joins

5. you will now have the cumulative column and the cumulative column offset by one day in the same row. Subtract one from the other and you have your daily movement. Obviously for the first and last value for each country/region you will need to add some logic into the subtraction

 

 

// if this is a solution please mark as such. Kudos always appreciated.

Thanks, samdthompson! This is a good hint...if I can make it work, I will post the code as a solution. 

edhans
Super User
Super User

How to get good help fast. Help us help you.

How To Ask A Technical Question If you Really Want An Answer

How to Get Your Question Answered Quickly - Give us a good and concise explanation
How to provide sample data in the Power BI Forum - Provide data in a table format per the link, or share an Excel/CSV file via OneDrive, Dropbox, etc.. Provide expected output using a screenshot of Excel or other image. Do not provide a screenshot of the source data. I cannot paste an image into Power BI tables.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Helpful resources

Announcements
2022 Release Wave 1 760x460.png

2022 Release Wave 1 Plan

Power Platform release plan for the 2022 release wave 1 describes all new features releasing from April 2022 through September 2022.

Power BI December 2021 Update_carousel 768x460.jpg

Check it Out!

Click here to read more about the December 2021 Updates!

Top Solution Authors