Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
jeffreyweir
Helper III
Helper III

DAX formula to calculate cumulative running total across two columns that halves on non workdays

Okay, so this isn't strictly a PowerBI problem, but rather a DAX problem that I'm trying to solve in PowerPivot./Excel, although we will be migrating the workbook to PowerBI in due course.

 

So here's my problem:

I have some data (90,000 rows) I'm trying to use to calculate a cumulative "fatigue score" for folk working shifts...currently using PowerPivot/Excel 2016.

 

As per the below screenshot, the dataset is shift data for multiple employees, that has a cumulative count of days worked vs. days off that resets back to 1 whenever they switch from one state to the other, and a 'Score' column that in my production data contains a measure of how fatigued they are.

 

I would like to cumulatively sum that fatigue score, and halve the cumulative sum for each line that Workday is FALSE. . My desired output is in the 'Cumulative Total' column far right, and I've used grey highlighting to show days worked vs. days off as well as put a bold border around separate Emp_ID blocks to help demonstrate the data. 

 

Capture.PNG

 

This is very similar to a question I asked previously at StackOverflow, with the difference that I need the cumulative total to halve every time the Workday value is FALSE. Alejondro came up with a solution to that problem that looked like this:

Cumulative Score =
CALCULATE (
    SUM ( 'Shifts'[Score] ),
    FILTER (
        ALLSELECTED ( Shifts ),
        [Helper] = MAX ( [Helper] )
            && [EMP_ID] = MAX ( Shifts[EMP_ID] )
            && [Date] <= MAX ( Shifts[Date] )
    )
)

 

...but I need to amend this to halve the cumulative total each time the Workday value is FALSE.

 

While making these kinds of adjustments to cumulative totals is very simple in Excel, I'm not sure this is possible in DAX.  Currently we are in fact using PowerPivot to serve up the data in Excel, and then using Excel to create the cumulative logic, and then pulling the cumulative totals back into PowerPivot via a linked table, which works but is clunky.

 

 

1 ACCEPTED SOLUTION

Hi @jeffreyweir

 

@Phil_Seamark alerted me to this thread.

 

Yes, your Cumulative Total can be done in a DAX calculated column. (You could also consider Power Query).

 

Sample pbix here to illustrate the DAX.

 

A version of the column is:

Cumulative Total (new DAX) = 
VAR OuterDate = Shifts[Date]
RETURN
    SUMX (
        CALCULATETABLE (
            Shifts,
            ALLEXCEPT ( Shifts, Shifts[Emp_ID] ),
            Shifts[Date] <= OuterDate
        ),
        VAR InnerDate = Shifts[Date]
        RETURN
            POWER (
                0.5,
                CALCULATE (
                    COUNTROWS ( Shifts ),
                    ALLEXCEPT ( Shifts, Shifts[Emp_ID] ),
                    Shifts[Date] <= OuterDate,
                    Shifts[Date] >= InnerDate,
                    NOT ( Shifts[Workday] )
                )
            )
                * Shifts[Score]
    )

Out of interest, a version that works in "old DAX" without variables is:

Cumulative Total (old DAX) = 
SUMX (
    CALCULATETABLE (
        Shifts,
        ALLEXCEPT ( Shifts, Shifts[Emp_ID] ),
        Shifts[Date] <= EARLIER ( Shifts[Date] )
    ),
    POWER (
        0.5,
        CALCULATE (
            COUNTROWS ( Shifts ),
            ALLEXCEPT ( Shifts, Shifts[Emp_ID] ),
            Shifts[Date] <= EARLIER ( Shifts[Date], 2 ),
            Shifts[Date] >= EARLIER ( Shifts[Date] ),
            NOT ( Shifts[Workday] )
        )
    )
        * Shifts[Score]
)

The way I've replicated the recursive calculation is to

  1. Grab the rows of shifts for the current employee up to the current date.
  2. For each of those rows, count the number of non-work days from that row's date to the current date.
  3. Calculate 0.5^(row count from step 2) and multiply by each row's Score, then sum.

The reason this works is that if you imagine calculating the Cumulative Total for each row in turn, every time you hit another non-work day, each previous row gets multiplied by another factor of 0.5.

 

 

Cheers,

Owen

 


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

View solution in original post

25 REPLIES 25
jeffreyweir
Helper III
Helper III

This works almost perfectly in production. Almost, because it doesn't correctly handle the situation whereby an employee doesn

This works almost perfectly in production. Almost, because it doesn't correctly handle the situation when the first few days are not workdays. On the left is what I'm after, on the right is what I get.

 

That said, this is probably good enough for my needs. But it sure would be cool if the function could be amended to only start calculating from the first workday, so that those initial non workdays don't cause the first reading to be halved.

 

.Capture.PNG

Hi @jeffreyweir

Could you show the other columns alongside your extract above?

 

Was just trying to work out if you have a non-zero score sitting on those initial non-workdays.

 

Are you saying that if there is a score on the initial non-workdays, you only want it included in the Cumulative sum from the first workday onwards, and no halving to occur before the first workday?

 

Cheers,

Owen


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

Hi Owen. It was a zero score...it's always a zero score (i.e. no 'fatigue points') on a non workday. So

  • On workdays, staff accumulate 'fatigue points'
  • On non workdays, staff get no fatigue points, and their cumulative amount falls to half of what it was previously.

 

Other than that, your DAX works fine. But it must be REALLY computationally expensive, as I'm having to wait around 6 minutes sometimes when saving the file, or switching between Excel and the PowerPivot window, or making any kind of change in the PowerPivot window such as changing the name of a column or deleting it.

 

Previously I was doing the cumulative bit in Excel (using a PivotTable to get the raw data there so I could then do the calcs in a Table alongside), then sucking it back into the data model via a linked table. Which works fine, and I must say is much faster. But more complicated than an all DAX solution.

 

I'll try the PowerQuery solution, and see what imact that has on performance. I'll also try migrating the entire project from PowerPIvot to Analysis Services, and see what kind of performance I get. But I may yet have to keep doing the cumulative calcs in Excel. Which isn't a problem...it just requires an extra refresh, as I need to get refreshed data into Excel, and then once Excel creates the cumulative total I need to refresh the linked cumulative table so that the newly updated cumulative calcs get posted back to the data model.

 

I certainly appreciate the interest in this thread...will be a very good learning opportunity even if we end up with the status quo.

Strange: when I tested the 3 solutions, these included cases that started with a non working day and they all worked fine.

 

However, in my test data, all non working days had a score of 0 (just like in the examples in post #1).

In the case that a first non working day has a score, the Power Query solutions still work fine Smiley Tongue but the DAX solution starts with this score.

I guess that wouldn't be to hard to have that adjusted, if so required,

Specializing in Power Query Formula Language (M)
Phil_Seamark
Employee
Employee

Hi @jeffreyweir,

 

So what columns does your dataset already have, and which columns would you like a calcualted column/measure for?

 

And do you have a txt version of the data to make it easier to import, rather than a photo to help speed up the suggestions 🙂

 

 


To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

Whoops, sorry Phil...forgot to upload that file. Will upload to onedrive and post link as soon as I'm able (can't do it from work) but meanwhile have emailed you a copy in spreadsheet form.

 

In regards to the rest of your question, I'm not quite sure I follow. Using the sample data, the calculated column needs to match the sample results in the 'Cumulative Total' column, and it needs to reference the Emp_ID column, the Workday column, and the Score column.

 

 

Here's a link to the sample file:
https://1drv.ms/x/s!Ah_zTnaUo4Dzjnt6KWy8ZxWAEfPC

 

Hi @jeffreyweir

 

@Phil_Seamark alerted me to this thread.

 

Yes, your Cumulative Total can be done in a DAX calculated column. (You could also consider Power Query).

 

Sample pbix here to illustrate the DAX.

 

A version of the column is:

Cumulative Total (new DAX) = 
VAR OuterDate = Shifts[Date]
RETURN
    SUMX (
        CALCULATETABLE (
            Shifts,
            ALLEXCEPT ( Shifts, Shifts[Emp_ID] ),
            Shifts[Date] <= OuterDate
        ),
        VAR InnerDate = Shifts[Date]
        RETURN
            POWER (
                0.5,
                CALCULATE (
                    COUNTROWS ( Shifts ),
                    ALLEXCEPT ( Shifts, Shifts[Emp_ID] ),
                    Shifts[Date] <= OuterDate,
                    Shifts[Date] >= InnerDate,
                    NOT ( Shifts[Workday] )
                )
            )
                * Shifts[Score]
    )

Out of interest, a version that works in "old DAX" without variables is:

Cumulative Total (old DAX) = 
SUMX (
    CALCULATETABLE (
        Shifts,
        ALLEXCEPT ( Shifts, Shifts[Emp_ID] ),
        Shifts[Date] <= EARLIER ( Shifts[Date] )
    ),
    POWER (
        0.5,
        CALCULATE (
            COUNTROWS ( Shifts ),
            ALLEXCEPT ( Shifts, Shifts[Emp_ID] ),
            Shifts[Date] <= EARLIER ( Shifts[Date], 2 ),
            Shifts[Date] >= EARLIER ( Shifts[Date] ),
            NOT ( Shifts[Workday] )
        )
    )
        * Shifts[Score]
)

The way I've replicated the recursive calculation is to

  1. Grab the rows of shifts for the current employee up to the current date.
  2. For each of those rows, count the number of non-work days from that row's date to the current date.
  3. Calculate 0.5^(row count from step 2) and multiply by each row's Score, then sum.

The reason this works is that if you imagine calculating the Cumulative Total for each row in turn, every time you hit another non-work day, each previous row gets multiplied by another factor of 0.5.

 

 

Cheers,

Owen

 


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

Far out! When I half explained this issue to Phil the other day and he notchelently said "yeah, DAX can do that", well...to quote The Castle, I thought he was "dreamin".

 

This solution is going "straight to the Pool Room."

 

Now to step through it. I'm an Excel Formulas and VBA dude, so DAX is still pretty new to me. This has doubled my interest in DAX. Many, many thanks.

Sean
Community Champion
Community Champion

@OwenAuger  Great Job again! Smiley Happy

 

@Phil_Seamark  I was not responding to you because I was looking into this...

HALF RT on Non Workdays.png

Now have to process Owen's solution Smiley Happy

 

EDIT: @MarcelBeug ??? we are all waiting for the M solution Smiley Happy

I had a look at an M approach earlier using List.Accumulate but it didn't provide enough granularity over the iterations to change the operation.  I didn't look into creating a function.  I definitey think @MarcelBeug will be able to come up with something pretty cool.


To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

@Sean, @Phil_Seamark

Would this List.Generate based M Solution be cool enough?

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Typed1 = Table.TransformColumnTypes(Source,{{"Emp_ID", Int64.Type}, {"Date", type date}, {"Cumulative Days", Int64.Type}, {"Workday", type logical}, {"Score", Int64.Type}, {"Cumulative Total", type number}}),
    Removed = Table.RemoveColumns(Typed1,{"Cumulative Total"}),
    Cumulation = List.Generate(() => [Index = 1,
                                     Employee = Removed[Emp_ID]{0},
                                     Total = if Removed[Workday]{0} then Removed[Score]{0} else 0],

                            each [Index] <= Table.RowCount(Removed),

                            each [Index = [Index] + 1,
                                  Employee = Removed[Emp_ID]{[Index]},
                                  Total = if Employee <> [Employee]
                                          then if Removed[Workday]{[Index]} 
                                               then Removed[Score]{[Index]}
                                               else 0
                                          else if Removed[Workday]{[Index]}
                                               then [Total] + Removed[Score]{[Index]}
                                               else [Total] * 0.5],

                            each [Total]),

    Records = Table.ToRecords(Removed),
    CombinedTable = Table.FromColumns({Records,Cumulation},{"Records","Cumulative Total"}),
    ExpandedRecords = Table.ExpandRecordColumn(CombinedTable, "Records", {"Emp_ID", "Date", "Cumulative Days", "Workday", "Score"}, {"Emp_ID", "Date", "Cumulative Days", "Workday", "Score"}),
    Typed2 = Table.TransformColumnTypes(ExpandedRecords,{{"Cumulative Total", type number}})
in
    Typed2
Specializing in Power Query Formula Language (M)

Most definitely @MarcelBeug.  I will learn plenty from that one 🙂


To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

@Phil_Seamark, @jeffreyweir, @Sean, @OwenAuger

 

Guys (and I'm pretty sure @ImkeF will be interested as well),

 

To be honest I was not too comfortable with the efficiency of the M solution I provided earlier, as it included references to individual values - e.g. Removed[Emp_ID]{[Index]} - which is usually a performance killer.

 

So I kept on thinking about a better solution and I have found one.

It took about 3 to 4 seconds to complete a test with 5000 input records, whereas my previous solution ran for something like 10 minutes.

 

The tricks with this new solution are:

  1. A formula is built up in text first, and then evaluated with Expression.Evaluate:
    Transform Example screenshot.png
  2. Using group by Emp_Id, so there is no need to merge the table with itself to get the Emp_Id with the previous Emp_Id on the same row.

Another nice feauture is that it uses a query structure, similar to the one created with the "Combine Binaries" functionality since the November 2016 upgrade.

 

So I have a main query:

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Emp_ID", Int64.Type}, {"Date", type datetime}, {"Cumulative Days", Int64.Type}, {"Workday", type logical}, {"Score", Int64.Type}, {"Cumulative Total", type number}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Cumulative Total"}),
    #"Added Index2" = Table.AddIndexColumn(#"Removed Columns", "Original Sort", 1, 1),
    #"Sorted Rows3" = Table.Sort(#"Added Index2",{{"Emp_ID", Order.Ascending}, {"Date", Order.Ascending}}),
    #"Added Custom" = Table.AddColumn(#"Sorted Rows3", "FormulaTextPart", each if [Workday] then "+" & Text.From([Score]) &")" else "*.5)"),
    #"Grouped Rows" = Table.Buffer(Table.Group(#"Added Custom", {"Emp_ID"}, {{"AllData", each _, type table}})),
    #"Invoked Custom Function" = Table.AddColumn(#"Grouped Rows", "Cumulate", each Cumulate([AllData])),
    #"Removed Columns1" = Table.RemoveColumns(#"Invoked Custom Function",{"AllData"}),
    #"Expanded Cumulate" = Table.ExpandTableColumn(#"Removed Columns1", "Cumulate", {"Date", "Cumulative Days", "Workday", "Score", "Original Sort", "Total Cumulative"}, {"Date", "Cumulative Days", "Workday", "Score", "Original Sort", "Total Cumulative"}),
    #"Sorted Rows" = Table.Sort(#"Expanded Cumulate",{{"Original Sort", Order.Ascending}}),
    #"Removed Columns2" = Table.RemoveColumns(#"Sorted Rows",{"Original Sort"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns2",{{"Date", type date}, {"Cumulative Days", Int64.Type}, {"Workday", type logical}, {"Score", Int64.Type}, {"Total Cumulative", type number}})
in
    #"Changed Type1"

 

An Example query:

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Emp_ID", Int64.Type}, {"Date", type datetime}, {"Cumulative Days", Int64.Type}, {"Workday", type logical}, {"Score", Int64.Type}, {"Cumulative Total", type number}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Cumulative Total"}),
    #"Added Index2" = Table.AddIndexColumn(#"Removed Columns", "Original Sort", 1, 1),
    #"Sorted Rows3" = Table.Sort(#"Added Index2",{{"Emp_ID", Order.Ascending}, {"Date", Order.Ascending}}),
    #"Added Custom" = Table.AddColumn(#"Sorted Rows3", "FormulaTextPart", each if [Workday] then "+" & Text.From([Score]) &")" else "*.5)"),
    #"Grouped Rows" = Table.Group(#"Added Custom", {"Emp_ID"}, {{"AllData", each _, type table}}),
    GetFirstTable = Table.Buffer(#"Grouped Rows"{0}[AllData])
in
    GetFirstTable

 

A Transform Example query:

 

let
    Source = #"Example",
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 1, 1),
    #"Added Custom1" = Table.AddColumn(#"Added Index", "Formula", each Text.Repeat("(",[Index])&"0"&Text.Combine(List.Range(#"Added Index"[FormulaTextPart],0,[Index]))),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Total Cumulative", each Expression.Evaluate([Formula])),
    #"Removed Other Columns" = Table.Buffer(Table.SelectColumns(#"Added Custom2",{"Date", "Cumulative Days", "Workday", "Score", "Original Sort", "Total Cumulative"}))
in
    #"Removed Other Columns"

 

From which I created function Cumulate:

 

let
    Source = (#"Example") => let
    Source = #"Example",
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 1, 1),
    #"Added Custom1" = Table.AddColumn(#"Added Index", "Formula", each Text.Repeat("(",[Index])&"0"&Text.Combine(List.Range(#"Added Index"[FormulaTextPart],0,[Index]))),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Total Cumulative", each Expression.Evaluate([Formula])),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom2",{"Date", "Cumulative Days", "Workday", "Score", "Original Sort", "Total Cumulative"})
in
    #"Removed Other Columns"
in
    Source

 

Remark: a small difference with the code structure created from the "Combine Binaries" functionality: I use #"Example" as a parameter, although it is not a real parameter but a regular query. As a result, it is not possible to have the function automatically updated with changes in the Transform Example query.

The reason for this difference is, that it is not possible to define a parameter as type table, whereas you can define a parameter as type binary,

Specializing in Power Query Formula Language (M)

Hi Marcel,

this looks like a very cool version, but as I don't like the new combine binaries expierence, personally I'd prefer the List.Generate 😉

 

 I would be interested about its performance if you adjust these 2 aspects:

 

1) Don't apply it on the whole table, but "partition" the table before by grouping on EmpID (you can skip this command then:    Employee = Removed[Emp_ID]{0},)

 

2) Never ever feed a table or a list to List.Generate without buffering it first 

 

Cheers, Imke 🙂

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

MarcelBeug
Community Champion
Community Champion

Sorry @ImkeF, I will never do it again. Smiley Embarassed Smiley Very Happy

 

You're right about the performance: I adjusted the List.Generate solution and this has now comparable performance with my second solution:

I had both run concurrently on 25,000 rows of Input and they both finished after about 15 seconds (scientific level of the test close to 0, but fair enough for me).

 

For you to verify if I understood correctly: I adjusted the query and split it in a query and a function as follows:

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Typed1 = Table.TransformColumnTypes(Source,{{"Emp_ID", Int64.Type}, {"Date", type date}, {"Cumulative Days", Int64.Type}, {"Workday", type logical}, {"Score", Int64.Type}, {"Cumulative Total", type number}}),
    Removed = Table.Buffer(Table.RemoveColumns(Typed1,{"Cumulative Total"})),
    #"Grouped Rows" = Table.Group(Removed, {"Emp_ID"}, {{"AllData", each _, type table}}),
    #"Invoked Custom Function" = Table.AddColumn(#"Grouped Rows", "Cumulated", each Generate([AllData])),
    #"Removed Columns" = Table.RemoveColumns(#"Invoked Custom Function",{"AllData"}),
    #"Expanded Cumulated" = Table.ExpandTableColumn(#"Removed Columns", "Cumulated", {"Date", "Cumulative Days", "Workday", "Score", "Cumulative Total"}, {"Date", "Cumulative Days", "Workday", "Score", "Cumulative Total"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Cumulated",{{"Date", type date}, {"Cumulative Days", Int64.Type}, {"Workday", type logical}, {"Score", Int64.Type}, {"Cumulative Total", type number}})
in
    #"Changed Type"

 

Function Generate:

 

let
    Source = (TableSoFar as table) =>
let
    Removed = Table.Buffer(TableSoFar),
    Cumulation = List.Generate(() => [Index = 1,
                                     Total = if Removed[Workday]{0} then Removed[Score]{0} else 0],

                            each [Index] <= Table.RowCount(Removed),

                            each [Index = [Index] + 1,
                                  Total = if Removed[Workday]{[Index]}
                                          then [Total] + Removed[Score]{[Index]}
                                          else [Total] * 0.5],

                            each [Total]),

    Records = Table.ToRecords(Removed),
    CombinedTable = Table.FromColumns({Records,Cumulation},{"Records","Cumulative Total"}),
    ExpandedRecords = Table.ExpandRecordColumn(CombinedTable, "Records", {"Emp_ID", "Date", "Cumulative Days", "Workday", "Score"}, {"Emp_ID", "Date", "Cumulative Days", "Workday", "Score"})
in
    ExpandedRecords
in
    Source
Specializing in Power Query Formula Language (M)

Hi all,

 

I've been using running totals quite a bit in my powerquery career. Up until now I was using the List.Sum method as posted here: https://www.excelguru.ca/blog/2015/03/31/create-running-totals-in-power-query/#comment-264936

 

It is my understanding that the List.Sum will have to continously re-read all the rows to do the sum whereas the solution posted here by @MarcelBeug and @ImkeF only needs to read each field once and should be considerably faster, correct?

 

So I am therefore trying to create a generic running total function that works with 'groups' as well. See modified code below. Two questions:

  1. How can we easily implement with dynamic group and value column names. I have tried passing the name as an argument but how can we do this without getting computationally intensive?
  2. How can we implement the expand with dynamic column names?

 

 

 

let
    Source = (TableSoFar as table) =>
let
    Removed = Table.Buffer(TableSoFar),
    Cumulation = List.Generate(
                            
                            //initial
                            () => [
                                    Index = 1,
                                    Total = Removed[Value]{0},
                                    Group = Removed[Group]{0}
                                  ],

                            //condition
                            each [Index] <= Table.RowCount(Removed),
                           
                            //next
                            each [ 
                                    Index = [Index] + 1,
                                    Total = if [Group] = Removed[Group]{[Index]} then [Total] + Removed[Value]{[Index]} else Removed[Value]{[Index]},
                                    Group = Removed[Group]{[Index]}
                                 ],

                            //selector
                            each [Total]),

    Records = Table.ToRecords(Removed),
    CombinedTable = Table.FromColumns({Records,Cumulation},{"Records","Cumulative Total"})
in
    CombinedTable
in
    Source

 

let
    Source = Table.FromRecords({[CustomerID = 1, Name = "Bob", Value= "1"], [CustomerID = 2, Name = "Bob", Value= "3"], [CustomerID = 3, Name = "Paul", Value= "5"], [CustomerID = 3, Name = "Paul", Value= "5"], [CustomerID = 3, Name = "Paul", Value= "5"], [CustomerID = 3, Name = "Jeff", Value= "5"], [CustomerID = 3, Name = "Jeff", Value= "2"], [CustomerID = 3, Name = "Jeff", Value= "3"]}),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Name", "Group"}}),
    Cumulative = Generate(#"Renamed Columns"),
    #"Expanded Records" = Table.ExpandRecordColumn(Cumulative, "Records", {"CustomerID", "Group", "Value"}, {"CustomerID", "Group", "Value"})
in
    #"Expanded Records"

 

Hi,

I don't have much time currently, so I will just post the code here for you to follow (should write a blogpost about it, as the code is really fast):

 

let
func = (Table as table, AmountColumn as text, optional SortColumn as text, optional GroupColumns) =>
let

/* Debug parameters
Table = #"Query1", 
SortColumn = null, 
AmountColumn = "Value",
GroupColumns = {"Name"},
*/
    GroupedTable = Table.Group(Table, GroupColumns, {{"Partition", each _, type table}}),
    fnRunningTotal = (Table as table, AmountColumn as text, optional SortColumn as text, optional GroupColumns) =>

    let
    Source = if SortColumn = null then Table else Table.Buffer(Table.Sort(Table,{{SortColumn, Order.Ascending}})),
// Sort table and buffer it
    Buffered = Table.Buffer(Table.AddIndexColumn(Source, "Index",1,1)),
// Select the Columns
    SelectColumns = Table.SelectColumns(Buffered, if SortColumn = null then {AmountColumn, "Index"} else {SortColumn, AmountColumn, "Index"}),
// Extract Amount column and buffer it
    ExtractAmountColumn = List.Buffer(Table.Column(SelectColumns, AmountColumn)),
// Calculate a list with all running Totals
    RunningTotal = List.Skip(List.Generate( ()=> [ListItem=0, Counter=0]
                                            ,each [Counter] <= List.Count(ExtractAmountColumn)
                                            ,each [ ListItem = ExtractAmountColumn{[Counter]}+[ListItem],
                                                    Counter = [Counter]+1
                                                   ]
                                            ),1),
    ConvertedTable = Table.FromList(RunningTotal, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    ExpandedColumn = Table.ExpandRecordColumn( ConvertedTable, "Column1", {"ListItem", "Counter"}, {"ListItem", "Counter"}),
    MergedQueries = Table.NestedJoin(Buffered,{"Index"}, ExpandedColumn,{"Counter"},"Expanded Column1",JoinKind.LeftOuter),
    Expand = Table.ExpandTableColumn( MergedQueries, "Expanded Column1", {"ListItem"}, {"RunningTotal"}),
    #"Removed Columns" = Table.RemoveColumns(Expand,{"Index"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"RunningTotal", type number}})
in
    #"Changed Type",

    AddRTToGroup = Table.AddColumn(GroupedTable, "Custom", each fnRunningTotal([Partition], AmountColumn)),
    #"Removed Other Columns" = Table.SelectColumns(AddRTToGroup,{"Custom"}),
    ExpandedGroup = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", Table.ColumnNames(Table) & {"RunningTotal"}),
    WithoutGroup = fnRunningTotal(Table, "Value"),
    Result = if GroupColumns = null then WithoutGroup else ExpandedGroup 
in
    Result
//*
, 
    documentation = [ Documentation.Name =  " Table.ColumnRunningTotal"
        ,Documentation.Description = " Fast way to add running total to a table"
        ,Documentation.LongDescription = " Fast way to add running total to a table"
        ,Documentation.Category = " Table"
        ,Documentation.Source = " local"
        ,Documentation.Author = " Imke Feldmann: www.TheBIccountant.com"
        ,Documentation.Examples = {[Description =  " ", Code = " ", Result = " "]}] 
 in 
  Value.ReplaceType(func, Value.ReplaceMetadata(Value.Type(func), documentation))

 

For your table, you would call it like so: 

 

MyFunction(YourTable, "Value", null, {"Name"})

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi all,

 

I've been using running totals quite a bit in my powerquery career. Up until now I was using the List.Sum method as posted here: https://www.excelguru.ca/blog/2015/03/31/create-running-totals-in-power-query/#comment-264936

 

It is my understanding that the List.Sum will have to continously re-read all the rows to do the sum whereas the solution posted here by @MarcelBeug and @ImkeF only needs to read each field once and should be considerably faster, correct?

 

So I am therefore trying to create a generic running total function that works with 'groups' as well. See modified code below. Two questions:

  1. How can we easily implement with dynamic group and value column names. I have tried passing the name as an argument but how can we do this without getting computationally intensive?
  2. How can we implement the expand with dynamic column names?

 

 

 

let
    Source = (TableSoFar as table) =>
let
    Removed = Table.Buffer(TableSoFar),
    Cumulation = List.Generate(
                            
                            //initial
                            () => [
                                    Index = 1,
                                    Total = Removed[Value]{0},
                                    Group = Removed[Group]{0}
                                  ],

                            //condition
                            each [Index] <= Table.RowCount(Removed),
                           
                            //next
                            each [ 
                                    Index = [Index] + 1,
                                    Total = if [Group] = Removed[Group]{[Index]} then [Total] + Removed[Value]{[Index]} else Removed[Value]{[Index]},
                                    Group = Removed[Group]{[Index]}
                                 ],

                            //selector
                            each [Total]),

    Records = Table.ToRecords(Removed),
    CombinedTable = Table.FromColumns({Records,Cumulation},{"Records","Cumulative Total"})
in
    CombinedTable
in
    Source

 

let
Source = Table.FromRecords({[CustomerID = 1, Name = "Bob", Value= "1"], [CustomerID = 2, Name = "Bob", Value= "3"], [CustomerID = 3, Name = "Paul", Value= "5"], [CustomerID = 3, Name = "Paul", Value= "5"], [CustomerID = 3, Name = "Paul", Value= "5"], [CustomerID = 3, Name = "Jeff", Value= "5"], [CustomerID = 3, Name = "Jeff", Value= "2"], [CustomerID = 3, Name = "Jeff", Value= "3"]}),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Name", "Group"}}),
#"Invoked Custom Function" = Generate(#"Renamed Columns"),
#"Expanded Records" = Table.ExpandRecordColumn(#"Invoked Custom Function", "Records", {"CustomerID", "Group", "Value"}, {"CustomerID", "Group", "Value"})
in
#"Expanded Records"

 

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.