Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Okay, so this isn't strictly a PowerBI problem, but rather a DAX problem that I'm trying to solve in PowerPivot./Excel, although we will be migrating the workbook to PowerBI in due course.
So here's my problem:
I have some data (90,000 rows) I'm trying to use to calculate a cumulative "fatigue score" for folk working shifts...currently using PowerPivot/Excel 2016.
As per the below screenshot, the dataset is shift data for multiple employees, that has a cumulative count of days worked vs. days off that resets back to 1 whenever they switch from one state to the other, and a 'Score' column that in my production data contains a measure of how fatigued they are.
I would like to cumulatively sum that fatigue score, and halve the cumulative sum for each line that Workday is FALSE. . My desired output is in the 'Cumulative Total' column far right, and I've used grey highlighting to show days worked vs. days off as well as put a bold border around separate Emp_ID blocks to help demonstrate the data.
This is very similar to a question I asked previously at StackOverflow, with the difference that I need the cumulative total to halve every time the Workday value is FALSE. Alejondro came up with a solution to that problem that looked like this:
Cumulative Score =
CALCULATE (
SUM ( 'Shifts'[Score] ),
FILTER (
ALLSELECTED ( Shifts ),
[Helper] = MAX ( [Helper] )
&& [EMP_ID] = MAX ( Shifts[EMP_ID] )
&& [Date] <= MAX ( Shifts[Date] )
)
)
...but I need to amend this to halve the cumulative total each time the Workday value is FALSE.
While making these kinds of adjustments to cumulative totals is very simple in Excel, I'm not sure this is possible in DAX. Currently we are in fact using PowerPivot to serve up the data in Excel, and then using Excel to create the cumulative logic, and then pulling the cumulative totals back into PowerPivot via a linked table, which works but is clunky.
Solved! Go to Solution.
Hi @jeffreyweir
@Phil_Seamark alerted me to this thread.
Yes, your Cumulative Total can be done in a DAX calculated column. (You could also consider Power Query).
Sample pbix here to illustrate the DAX.
A version of the column is:
Cumulative Total (new DAX) = VAR OuterDate = Shifts[Date] RETURN SUMX ( CALCULATETABLE ( Shifts, ALLEXCEPT ( Shifts, Shifts[Emp_ID] ), Shifts[Date] <= OuterDate ), VAR InnerDate = Shifts[Date] RETURN POWER ( 0.5, CALCULATE ( COUNTROWS ( Shifts ), ALLEXCEPT ( Shifts, Shifts[Emp_ID] ), Shifts[Date] <= OuterDate, Shifts[Date] >= InnerDate, NOT ( Shifts[Workday] ) ) ) * Shifts[Score] )
Out of interest, a version that works in "old DAX" without variables is:
Cumulative Total (old DAX) = SUMX ( CALCULATETABLE ( Shifts, ALLEXCEPT ( Shifts, Shifts[Emp_ID] ), Shifts[Date] <= EARLIER ( Shifts[Date] ) ), POWER ( 0.5, CALCULATE ( COUNTROWS ( Shifts ), ALLEXCEPT ( Shifts, Shifts[Emp_ID] ), Shifts[Date] <= EARLIER ( Shifts[Date], 2 ), Shifts[Date] >= EARLIER ( Shifts[Date] ), NOT ( Shifts[Workday] ) ) ) * Shifts[Score] )
The way I've replicated the recursive calculation is to
The reason this works is that if you imagine calculating the Cumulative Total for each row in turn, every time you hit another non-work day, each previous row gets multiplied by another factor of 0.5.
Cheers,
Owen
This works almost perfectly in production. Almost, because it doesn't correctly handle the situation whereby an employee doesn
This works almost perfectly in production. Almost, because it doesn't correctly handle the situation when the first few days are not workdays. On the left is what I'm after, on the right is what I get.
That said, this is probably good enough for my needs. But it sure would be cool if the function could be amended to only start calculating from the first workday, so that those initial non workdays don't cause the first reading to be halved.
.
Hi @jeffreyweir
Could you show the other columns alongside your extract above?
Was just trying to work out if you have a non-zero score sitting on those initial non-workdays.
Are you saying that if there is a score on the initial non-workdays, you only want it included in the Cumulative sum from the first workday onwards, and no halving to occur before the first workday?
Cheers,
Owen
Hi Owen. It was a zero score...it's always a zero score (i.e. no 'fatigue points') on a non workday. So
Other than that, your DAX works fine. But it must be REALLY computationally expensive, as I'm having to wait around 6 minutes sometimes when saving the file, or switching between Excel and the PowerPivot window, or making any kind of change in the PowerPivot window such as changing the name of a column or deleting it.
Previously I was doing the cumulative bit in Excel (using a PivotTable to get the raw data there so I could then do the calcs in a Table alongside), then sucking it back into the data model via a linked table. Which works fine, and I must say is much faster. But more complicated than an all DAX solution.
I'll try the PowerQuery solution, and see what imact that has on performance. I'll also try migrating the entire project from PowerPIvot to Analysis Services, and see what kind of performance I get. But I may yet have to keep doing the cumulative calcs in Excel. Which isn't a problem...it just requires an extra refresh, as I need to get refreshed data into Excel, and then once Excel creates the cumulative total I need to refresh the linked cumulative table so that the newly updated cumulative calcs get posted back to the data model.
I certainly appreciate the interest in this thread...will be a very good learning opportunity even if we end up with the status quo.
Strange: when I tested the 3 solutions, these included cases that started with a non working day and they all worked fine.
However, in my test data, all non working days had a score of 0 (just like in the examples in post #1).
In the case that a first non working day has a score, the Power Query solutions still work fine but the DAX solution starts with this score.
I guess that wouldn't be to hard to have that adjusted, if so required,
Hi @jeffreyweir,
So what columns does your dataset already have, and which columns would you like a calcualted column/measure for?
And do you have a txt version of the data to make it easier to import, rather than a photo to help speed up the suggestions 🙂
Whoops, sorry Phil...forgot to upload that file. Will upload to onedrive and post link as soon as I'm able (can't do it from work) but meanwhile have emailed you a copy in spreadsheet form.
In regards to the rest of your question, I'm not quite sure I follow. Using the sample data, the calculated column needs to match the sample results in the 'Cumulative Total' column, and it needs to reference the Emp_ID column, the Workday column, and the Score column.
Here's a link to the sample file:
https://1drv.ms/x/s!Ah_zTnaUo4Dzjnt6KWy8ZxWAEfPC
Hi @jeffreyweir
@Phil_Seamark alerted me to this thread.
Yes, your Cumulative Total can be done in a DAX calculated column. (You could also consider Power Query).
Sample pbix here to illustrate the DAX.
A version of the column is:
Cumulative Total (new DAX) = VAR OuterDate = Shifts[Date] RETURN SUMX ( CALCULATETABLE ( Shifts, ALLEXCEPT ( Shifts, Shifts[Emp_ID] ), Shifts[Date] <= OuterDate ), VAR InnerDate = Shifts[Date] RETURN POWER ( 0.5, CALCULATE ( COUNTROWS ( Shifts ), ALLEXCEPT ( Shifts, Shifts[Emp_ID] ), Shifts[Date] <= OuterDate, Shifts[Date] >= InnerDate, NOT ( Shifts[Workday] ) ) ) * Shifts[Score] )
Out of interest, a version that works in "old DAX" without variables is:
Cumulative Total (old DAX) = SUMX ( CALCULATETABLE ( Shifts, ALLEXCEPT ( Shifts, Shifts[Emp_ID] ), Shifts[Date] <= EARLIER ( Shifts[Date] ) ), POWER ( 0.5, CALCULATE ( COUNTROWS ( Shifts ), ALLEXCEPT ( Shifts, Shifts[Emp_ID] ), Shifts[Date] <= EARLIER ( Shifts[Date], 2 ), Shifts[Date] >= EARLIER ( Shifts[Date] ), NOT ( Shifts[Workday] ) ) ) * Shifts[Score] )
The way I've replicated the recursive calculation is to
The reason this works is that if you imagine calculating the Cumulative Total for each row in turn, every time you hit another non-work day, each previous row gets multiplied by another factor of 0.5.
Cheers,
Owen
Far out! When I half explained this issue to Phil the other day and he notchelently said "yeah, DAX can do that", well...to quote The Castle, I thought he was "dreamin".
This solution is going "straight to the Pool Room."
Now to step through it. I'm an Excel Formulas and VBA dude, so DAX is still pretty new to me. This has doubled my interest in DAX. Many, many thanks.
@OwenAuger Great Job again!
@Phil_Seamark I was not responding to you because I was looking into this...
Now have to process Owen's solution
EDIT: @MarcelBeug ??? we are all waiting for the M solution
I had a look at an M approach earlier using List.Accumulate but it didn't provide enough granularity over the iterations to change the operation. I didn't look into creating a function. I definitey think @MarcelBeug will be able to come up with something pretty cool.
Would this List.Generate based M Solution be cool enough?
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content], Typed1 = Table.TransformColumnTypes(Source,{{"Emp_ID", Int64.Type}, {"Date", type date}, {"Cumulative Days", Int64.Type}, {"Workday", type logical}, {"Score", Int64.Type}, {"Cumulative Total", type number}}), Removed = Table.RemoveColumns(Typed1,{"Cumulative Total"}), Cumulation = List.Generate(() => [Index = 1, Employee = Removed[Emp_ID]{0}, Total = if Removed[Workday]{0} then Removed[Score]{0} else 0], each [Index] <= Table.RowCount(Removed), each [Index = [Index] + 1, Employee = Removed[Emp_ID]{[Index]}, Total = if Employee <> [Employee] then if Removed[Workday]{[Index]} then Removed[Score]{[Index]} else 0 else if Removed[Workday]{[Index]} then [Total] + Removed[Score]{[Index]} else [Total] * 0.5], each [Total]), Records = Table.ToRecords(Removed), CombinedTable = Table.FromColumns({Records,Cumulation},{"Records","Cumulative Total"}), ExpandedRecords = Table.ExpandRecordColumn(CombinedTable, "Records", {"Emp_ID", "Date", "Cumulative Days", "Workday", "Score"}, {"Emp_ID", "Date", "Cumulative Days", "Workday", "Score"}), Typed2 = Table.TransformColumnTypes(ExpandedRecords,{{"Cumulative Total", type number}}) in Typed2
@Phil_Seamark, @jeffreyweir, @Sean, @OwenAuger
Guys (and I'm pretty sure @ImkeF will be interested as well),
To be honest I was not too comfortable with the efficiency of the M solution I provided earlier, as it included references to individual values - e.g. Removed[Emp_ID]{[Index]} - which is usually a performance killer.
So I kept on thinking about a better solution and I have found one.
It took about 3 to 4 seconds to complete a test with 5000 input records, whereas my previous solution ran for something like 10 minutes.
The tricks with this new solution are:
Another nice feauture is that it uses a query structure, similar to the one created with the "Combine Binaries" functionality since the November 2016 upgrade.
So I have a main query:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content], #"Changed Type" = Table.TransformColumnTypes(Source,{{"Emp_ID", Int64.Type}, {"Date", type datetime}, {"Cumulative Days", Int64.Type}, {"Workday", type logical}, {"Score", Int64.Type}, {"Cumulative Total", type number}}), #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Cumulative Total"}), #"Added Index2" = Table.AddIndexColumn(#"Removed Columns", "Original Sort", 1, 1), #"Sorted Rows3" = Table.Sort(#"Added Index2",{{"Emp_ID", Order.Ascending}, {"Date", Order.Ascending}}), #"Added Custom" = Table.AddColumn(#"Sorted Rows3", "FormulaTextPart", each if [Workday] then "+" & Text.From([Score]) &")" else "*.5)"), #"Grouped Rows" = Table.Buffer(Table.Group(#"Added Custom", {"Emp_ID"}, {{"AllData", each _, type table}})), #"Invoked Custom Function" = Table.AddColumn(#"Grouped Rows", "Cumulate", each Cumulate([AllData])), #"Removed Columns1" = Table.RemoveColumns(#"Invoked Custom Function",{"AllData"}), #"Expanded Cumulate" = Table.ExpandTableColumn(#"Removed Columns1", "Cumulate", {"Date", "Cumulative Days", "Workday", "Score", "Original Sort", "Total Cumulative"}, {"Date", "Cumulative Days", "Workday", "Score", "Original Sort", "Total Cumulative"}), #"Sorted Rows" = Table.Sort(#"Expanded Cumulate",{{"Original Sort", Order.Ascending}}), #"Removed Columns2" = Table.RemoveColumns(#"Sorted Rows",{"Original Sort"}), #"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns2",{{"Date", type date}, {"Cumulative Days", Int64.Type}, {"Workday", type logical}, {"Score", Int64.Type}, {"Total Cumulative", type number}}) in #"Changed Type1"
An Example query:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content], #"Changed Type" = Table.TransformColumnTypes(Source,{{"Emp_ID", Int64.Type}, {"Date", type datetime}, {"Cumulative Days", Int64.Type}, {"Workday", type logical}, {"Score", Int64.Type}, {"Cumulative Total", type number}}), #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Cumulative Total"}), #"Added Index2" = Table.AddIndexColumn(#"Removed Columns", "Original Sort", 1, 1), #"Sorted Rows3" = Table.Sort(#"Added Index2",{{"Emp_ID", Order.Ascending}, {"Date", Order.Ascending}}), #"Added Custom" = Table.AddColumn(#"Sorted Rows3", "FormulaTextPart", each if [Workday] then "+" & Text.From([Score]) &")" else "*.5)"), #"Grouped Rows" = Table.Group(#"Added Custom", {"Emp_ID"}, {{"AllData", each _, type table}}), GetFirstTable = Table.Buffer(#"Grouped Rows"{0}[AllData]) in GetFirstTable
A Transform Example query:
let Source = #"Example", #"Added Index" = Table.AddIndexColumn(Source, "Index", 1, 1), #"Added Custom1" = Table.AddColumn(#"Added Index", "Formula", each Text.Repeat("(",[Index])&"0"&Text.Combine(List.Range(#"Added Index"[FormulaTextPart],0,[Index]))), #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Total Cumulative", each Expression.Evaluate([Formula])), #"Removed Other Columns" = Table.Buffer(Table.SelectColumns(#"Added Custom2",{"Date", "Cumulative Days", "Workday", "Score", "Original Sort", "Total Cumulative"})) in #"Removed Other Columns"
From which I created function Cumulate:
let Source = (#"Example") => let Source = #"Example", #"Added Index" = Table.AddIndexColumn(Source, "Index", 1, 1), #"Added Custom1" = Table.AddColumn(#"Added Index", "Formula", each Text.Repeat("(",[Index])&"0"&Text.Combine(List.Range(#"Added Index"[FormulaTextPart],0,[Index]))), #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Total Cumulative", each Expression.Evaluate([Formula])), #"Removed Other Columns" = Table.SelectColumns(#"Added Custom2",{"Date", "Cumulative Days", "Workday", "Score", "Original Sort", "Total Cumulative"}) in #"Removed Other Columns" in Source
Remark: a small difference with the code structure created from the "Combine Binaries" functionality: I use #"Example" as a parameter, although it is not a real parameter but a regular query. As a result, it is not possible to have the function automatically updated with changes in the Transform Example query.
The reason for this difference is, that it is not possible to define a parameter as type table, whereas you can define a parameter as type binary,
Hi Marcel,
this looks like a very cool version, but as I don't like the new combine binaries expierence, personally I'd prefer the List.Generate 😉
I would be interested about its performance if you adjust these 2 aspects:
1) Don't apply it on the whole table, but "partition" the table before by grouping on EmpID (you can skip this command then: Employee = Removed[Emp_ID]{0},)
2) Never ever feed a table or a list to List.Generate without buffering it first
Cheers, Imke 🙂
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Sorry @ImkeF, I will never do it again.
You're right about the performance: I adjusted the List.Generate solution and this has now comparable performance with my second solution:
I had both run concurrently on 25,000 rows of Input and they both finished after about 15 seconds (scientific level of the test close to 0, but fair enough for me).
For you to verify if I understood correctly: I adjusted the query and split it in a query and a function as follows:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content], Typed1 = Table.TransformColumnTypes(Source,{{"Emp_ID", Int64.Type}, {"Date", type date}, {"Cumulative Days", Int64.Type}, {"Workday", type logical}, {"Score", Int64.Type}, {"Cumulative Total", type number}}), Removed = Table.Buffer(Table.RemoveColumns(Typed1,{"Cumulative Total"})), #"Grouped Rows" = Table.Group(Removed, {"Emp_ID"}, {{"AllData", each _, type table}}), #"Invoked Custom Function" = Table.AddColumn(#"Grouped Rows", "Cumulated", each Generate([AllData])), #"Removed Columns" = Table.RemoveColumns(#"Invoked Custom Function",{"AllData"}), #"Expanded Cumulated" = Table.ExpandTableColumn(#"Removed Columns", "Cumulated", {"Date", "Cumulative Days", "Workday", "Score", "Cumulative Total"}, {"Date", "Cumulative Days", "Workday", "Score", "Cumulative Total"}), #"Changed Type" = Table.TransformColumnTypes(#"Expanded Cumulated",{{"Date", type date}, {"Cumulative Days", Int64.Type}, {"Workday", type logical}, {"Score", Int64.Type}, {"Cumulative Total", type number}}) in #"Changed Type"
Function Generate:
let Source = (TableSoFar as table) => let Removed = Table.Buffer(TableSoFar), Cumulation = List.Generate(() => [Index = 1, Total = if Removed[Workday]{0} then Removed[Score]{0} else 0], each [Index] <= Table.RowCount(Removed), each [Index = [Index] + 1, Total = if Removed[Workday]{[Index]} then [Total] + Removed[Score]{[Index]} else [Total] * 0.5], each [Total]), Records = Table.ToRecords(Removed), CombinedTable = Table.FromColumns({Records,Cumulation},{"Records","Cumulative Total"}), ExpandedRecords = Table.ExpandRecordColumn(CombinedTable, "Records", {"Emp_ID", "Date", "Cumulative Days", "Workday", "Score"}, {"Emp_ID", "Date", "Cumulative Days", "Workday", "Score"}) in ExpandedRecords in Source
Hi all,
I've been using running totals quite a bit in my powerquery career. Up until now I was using the List.Sum method as posted here: https://www.excelguru.ca/blog/2015/03/31/create-running-totals-in-power-query/#comment-264936
It is my understanding that the List.Sum will have to continously re-read all the rows to do the sum whereas the solution posted here by @MarcelBeug and @ImkeF only needs to read each field once and should be considerably faster, correct?
So I am therefore trying to create a generic running total function that works with 'groups' as well. See modified code below. Two questions:
let Source = (TableSoFar as table) => let Removed = Table.Buffer(TableSoFar), Cumulation = List.Generate( //initial () => [ Index = 1, Total = Removed[Value]{0}, Group = Removed[Group]{0} ], //condition each [Index] <= Table.RowCount(Removed), //next each [ Index = [Index] + 1, Total = if [Group] = Removed[Group]{[Index]} then [Total] + Removed[Value]{[Index]} else Removed[Value]{[Index]}, Group = Removed[Group]{[Index]} ], //selector each [Total]), Records = Table.ToRecords(Removed), CombinedTable = Table.FromColumns({Records,Cumulation},{"Records","Cumulative Total"}) in CombinedTable in Source
let Source = Table.FromRecords({[CustomerID = 1, Name = "Bob", Value= "1"], [CustomerID = 2, Name = "Bob", Value= "3"], [CustomerID = 3, Name = "Paul", Value= "5"], [CustomerID = 3, Name = "Paul", Value= "5"], [CustomerID = 3, Name = "Paul", Value= "5"], [CustomerID = 3, Name = "Jeff", Value= "5"], [CustomerID = 3, Name = "Jeff", Value= "2"], [CustomerID = 3, Name = "Jeff", Value= "3"]}), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}}), #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Name", "Group"}}), Cumulative = Generate(#"Renamed Columns"), #"Expanded Records" = Table.ExpandRecordColumn(Cumulative, "Records", {"CustomerID", "Group", "Value"}, {"CustomerID", "Group", "Value"}) in #"Expanded Records"
Hi,
I don't have much time currently, so I will just post the code here for you to follow (should write a blogpost about it, as the code is really fast):
let func = (Table as table, AmountColumn as text, optional SortColumn as text, optional GroupColumns) => let /* Debug parameters Table = #"Query1", SortColumn = null, AmountColumn = "Value", GroupColumns = {"Name"}, */ GroupedTable = Table.Group(Table, GroupColumns, {{"Partition", each _, type table}}), fnRunningTotal = (Table as table, AmountColumn as text, optional SortColumn as text, optional GroupColumns) => let Source = if SortColumn = null then Table else Table.Buffer(Table.Sort(Table,{{SortColumn, Order.Ascending}})), // Sort table and buffer it Buffered = Table.Buffer(Table.AddIndexColumn(Source, "Index",1,1)), // Select the Columns SelectColumns = Table.SelectColumns(Buffered, if SortColumn = null then {AmountColumn, "Index"} else {SortColumn, AmountColumn, "Index"}), // Extract Amount column and buffer it ExtractAmountColumn = List.Buffer(Table.Column(SelectColumns, AmountColumn)), // Calculate a list with all running Totals RunningTotal = List.Skip(List.Generate( ()=> [ListItem=0, Counter=0] ,each [Counter] <= List.Count(ExtractAmountColumn) ,each [ ListItem = ExtractAmountColumn{[Counter]}+[ListItem], Counter = [Counter]+1 ] ),1), ConvertedTable = Table.FromList(RunningTotal, Splitter.SplitByNothing(), null, null, ExtraValues.Error), ExpandedColumn = Table.ExpandRecordColumn( ConvertedTable, "Column1", {"ListItem", "Counter"}, {"ListItem", "Counter"}), MergedQueries = Table.NestedJoin(Buffered,{"Index"}, ExpandedColumn,{"Counter"},"Expanded Column1",JoinKind.LeftOuter), Expand = Table.ExpandTableColumn( MergedQueries, "Expanded Column1", {"ListItem"}, {"RunningTotal"}), #"Removed Columns" = Table.RemoveColumns(Expand,{"Index"}), #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"RunningTotal", type number}}) in #"Changed Type", AddRTToGroup = Table.AddColumn(GroupedTable, "Custom", each fnRunningTotal([Partition], AmountColumn)), #"Removed Other Columns" = Table.SelectColumns(AddRTToGroup,{"Custom"}), ExpandedGroup = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", Table.ColumnNames(Table) & {"RunningTotal"}), WithoutGroup = fnRunningTotal(Table, "Value"), Result = if GroupColumns = null then WithoutGroup else ExpandedGroup in Result //* , documentation = [ Documentation.Name = " Table.ColumnRunningTotal" ,Documentation.Description = " Fast way to add running total to a table" ,Documentation.LongDescription = " Fast way to add running total to a table" ,Documentation.Category = " Table" ,Documentation.Source = " local" ,Documentation.Author = " Imke Feldmann: www.TheBIccountant.com" ,Documentation.Examples = {[Description = " ", Code = " ", Result = " "]}] in Value.ReplaceType(func, Value.ReplaceMetadata(Value.Type(func), documentation))
For your table, you would call it like so:
MyFunction(YourTable, "Value", null, {"Name"})
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi all,
I've been using running totals quite a bit in my powerquery career. Up until now I was using the List.Sum method as posted here: https://www.excelguru.ca/blog/2015/03/31/create-running-totals-in-power-query/#comment-264936
It is my understanding that the List.Sum will have to continously re-read all the rows to do the sum whereas the solution posted here by @MarcelBeug and @ImkeF only needs to read each field once and should be considerably faster, correct?
So I am therefore trying to create a generic running total function that works with 'groups' as well. See modified code below. Two questions:
let Source = (TableSoFar as table) => let Removed = Table.Buffer(TableSoFar), Cumulation = List.Generate( //initial () => [ Index = 1, Total = Removed[Value]{0}, Group = Removed[Group]{0} ], //condition each [Index] <= Table.RowCount(Removed), //next each [ Index = [Index] + 1, Total = if [Group] = Removed[Group]{[Index]} then [Total] + Removed[Value]{[Index]} else Removed[Value]{[Index]}, Group = Removed[Group]{[Index]} ], //selector each [Total]), Records = Table.ToRecords(Removed), CombinedTable = Table.FromColumns({Records,Cumulation},{"Records","Cumulative Total"}) in CombinedTable in Source
let
Source = Table.FromRecords({[CustomerID = 1, Name = "Bob", Value= "1"], [CustomerID = 2, Name = "Bob", Value= "3"], [CustomerID = 3, Name = "Paul", Value= "5"], [CustomerID = 3, Name = "Paul", Value= "5"], [CustomerID = 3, Name = "Paul", Value= "5"], [CustomerID = 3, Name = "Jeff", Value= "5"], [CustomerID = 3, Name = "Jeff", Value= "2"], [CustomerID = 3, Name = "Jeff", Value= "3"]}),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Name", "Group"}}),
#"Invoked Custom Function" = Generate(#"Renamed Columns"),
#"Expanded Records" = Table.ExpandRecordColumn(#"Invoked Custom Function", "Records", {"CustomerID", "Group", "Value"}, {"CustomerID", "Group", "Value"})
in
#"Expanded Records"
User | Count |
---|---|
103 | |
88 | |
78 | |
71 | |
69 |
User | Count |
---|---|
113 | |
99 | |
97 | |
72 | |
68 |