Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more.
Get startedGrow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.
Hi
I have 3 colums: a dataday, an expiry date column and a volumen column (you can see in the picture: each dataday has an expiry date and a unit value.)
How can I calculate a cumulative amount for the unit values in Power Query Editor, so that the calculation skips the datadays and unit values with an earlier expiry date than the dataday?
For example, there is an expiry date of 8 June for a 7 June dataday (where the volume is 10) and there is an expiry date of 8 June for a 8 June dataday (where the volume is 10) and therefore I don't want it to be included in the cumulative amount of 9 June. The correct result is therefore 71 for 9 June and not 91 because the expiry dates of 8 June are excluded from the aggregation.
Thank you very much!
@streli To calculate the cumulative amount while excluding the unit values with earlier expiry dates in Power Query Editor, you can use the following steps:
Assuming your table is named "YourTable" and the columns are named "DateDay," "ExpiryDate," and "Volume," follow these steps in Power Query Editor:
Step 1: Sort the data:
Sort the table by the "DateDay" column in ascending order and then by the "ExpiryDate" column in descending order. This will ensure that the unit values with earlier expiry dates are listed first and will be excluded from the cumulative calculation.
Step 2: Add an Index Column:
Add an Index column to the table. This will create a unique identifier for each row, which is essential for the next steps.
Step 3: Duplicate the table:
Duplicate the table using the "Duplicate" feature in Power Query Editor. This will create a copy of the original table that we can use for the cumulative calculation.
Step 4: Merge Queries:
Merge the original table with the duplicated table based on the "DateDay" and "ExpiryDate" columns. The merge should be a Left Anti join, which will keep only the rows from the original table where there is no match in the duplicated table (i.e., where the unit values have an earlier expiry date). This will effectively exclude those rows from the cumulative calculation.
Step 5: Group and Aggregate:
Group the resulting table by "DateDay" and aggregate the "Volume" column using the "Sum" function. This will calculate the cumulative amount, excluding the unit values with earlier expiry dates.
Step 6: Merge the Results:
Merge the aggregated table back with the original table based on the "DateDay" column. This will bring the cumulative amount back to the original table.
Step 7: Remove Unnecessary Columns:
Remove the duplicated table and any intermediate columns created during the process, keeping only the "DateDay" and "Volume" columns.
Now, your table should have a new column showing the cumulative amount while excluding the unit values with earlier expiry dates.
Hi Resolutions, thank you very much for your reply, I will be able to check your solution proposal on Monday. But I have Direct Query connect, in which case I cannot add Index Column...
Your description is not clear to me. For example, you write:
But there are no examples in your data where the expiry date occurs before the dataday date on the same row
If what you really mean is to exclude those situations where the expiry date is the same as or before the data day, there is only a single instance (where they are the same.
If you mean something else, you can adjust the comparison operator in the code below.
let
//This group of lines is to transform the result of pasting your screenshot into Excel into something usable.
// replace them with your actual data source
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type date}, {"Column2", type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Column2", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Column2.1", "Column2.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column2.1", Int64.Type}, {"Column2.2", type date}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type1",{{"Column1", "DataDay"}, {"Column2.1", "Volume"}, {"Column2.2", "ExpiryDay"}}),
//-----------------------------------
//Group Rows by DataDay Sum the Volume
#"Grouped Rows" = Table.Group(#"Renamed Columns", {"DataDay"}, {
{"Total", (t)=> List.Sum(Table.SelectRows(t, each [ExpiryDay]>[DataDay])[Volume]), type nullable number}}),
//Add Running Total Column
#"Running Total" =
let
vol = List.Buffer(#"Grouped Rows"[Total]),
runT = List.Generate(
()=>[rt = vol{0}, idx=0],
each [idx] < List.Count(vol),
each [rt = [rt] + vol{[idx]+1}, idx = [idx]+1],
each [rt]),
res = Table.FromColumns(
Table.ToColumns(#"Grouped Rows") & {runT},
type table[DataDay=date, Volume=Int64.Type, Volume Running Total=Int64.Type])
in
res
in
#"Running Total"
Results from your Data
If you want something else, please add sufficient detail
Hi, thank you very much for your reply, in the aggregate value for June 9 (and June 10), I would like to exclude these two values because these dates are earlier than June 9 (June 8 < June 9), so the correct value for June 9 would be 71 and the correct value for June 10 would be 92
Ok, I think I understand now. We will try the following:
Note that the first lines are merely to transform your data so I can use it, since you only provided a screenshot and not text information that could be copy/pasted
Running Total Custom Function
Rename fnRT
//Running Total Returns a Table
//Rename: "fnRT"
(tbl as table, col as text)=>
let
values = List.Buffer(Table.Column(tbl,col)),
colNames= List.Buffer(Table.ColumnNames(tbl)),
RT = List.Generate(
()=> [rt = values{0}, idx = 0],
each [idx] < List.Count(values),
each [rt = [rt] + values{[idx]+1}, idx = [idx]+1],
each [rt]
),
res = Table.FromColumns(
Table.ToColumns(tbl) & {RT},
colNames & {col & " Running Total"}
)
in
res
Main Query
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"DataDay", type date}, {"Volume", Int64.Type}, {"ExpiryDay", type date}}),
//Group Rows by DataDay; Sum the Volume
#"Grouped DataDay" = Table.Group(#"Changed Type", {"DataDay"}, {
{"Totals by DataDay", each List.Sum([Volume]), Int64.Type}}),
#"Sorted DataDay" = Table.Sort(#"Grouped DataDay",{{"DataDay", Order.Ascending}}),
//Group Rows by Expiry Day; Sum the Volume
#"Group ExpiryDay" = Table.Group(#"Changed Type",{"ExpiryDay"},{
{"Totals by ExpiryDay", each List.Sum([Volume]), Int64.Type}
}),
#"Sorted ExpiryDay" = Table.Sort(#"Group ExpiryDay",{{"ExpiryDay", Order.Ascending}}),
//Add Running Total Column by DataDay
#"RT DataDay" = fnRT(#"Sorted DataDay","Totals by DataDay"),
#"RT ExpiryDay" = fnRT(#"Sorted ExpiryDay", "Totals by ExpiryDay"),
//Subtract ExpiryDays prior or equal to dataday
#"Corr RT" = Table.AddColumn(#"RT DataDay", "Corrected Running Total",
(c)=> c[Totals by DataDay Running Total] -
(try Table.Last(Table.SelectRows(#"RT ExpiryDay", each [ExpiryDay] <
c[DataDay]))[Totals by ExpiryDay Running Total] otherwise 0), Int64.Type),
#"Removed Columns" = Table.RemoveColumns(#"Corr RT",{"Totals by DataDay", "Totals by DataDay Running Total"})
in
#"Removed Columns"
Data
Results
Hi, thanks again for yor reply!! Can you help me how to do this?
Hi ronrsnfld, thank you very much for your reply, I will be able to check your solution proposal on Monday, but I see that the final result is correct 🙂
Hi,
I don't fully grasp your required outcome, could you elaborate.
To compute a running total in Power Query you can make a regular one with List.Generate: https://gorilla.bi/power-query/running-total/
That looks something like this:
List.Generate (
// Start value: Set RT to 1st value in BuffValues and set RowIndex to 0
() => [ RT = BuffValues{0}, RowIndex = 0 ],
// Condition: Generate values as long as RowIndex is < than number of list items
each [RowIndex] < List.Count( BuffValues ),
// Record with logic for next values:
each [
// Calculate running total by adding current RT value and the next value
RT = List.Sum( { [RT], BuffValues{[RowIndex] + 1} } ),
// Increment the RowIndex by 1 for the next iteration
RowIndex = [RowIndex] + 1
],
// Only return the running total (RT) value for each row
each [RT]
)
And you can also create a grouped running total like here: https://gorilla.bi/power-query/running-total-by-category/
--------------------------------------------------
@ me in replies or I'll lose your thread
Master Power Query M? -> https://powerquery.how
Read in-depth articles? -> BI Gorilla
Youtube Channel: BI Gorilla
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
Hi, I will try to explain exactly what I want to do. Here are my data (I have 3 million rows), this is just a sample:
I want to calculate a cumulative amount for a volume, but so that the calculation skips values with expiry date < dataday. I get the cumulative amount grouped by dataday:
But I need 71 for June 9, not 91, because there are two expiry dates earlier than June 9 in the previous rows (volume 10 and 10), these should not be included in the cumulative value for June 9.
My problem is that I need to calculate the cumulative amount for more than 1 million rows (and at the same time skip the overdue amounts) and therefore I can't solve the problem with a DAX function.
Hi, @streli
PQ is not the ideal tool for these kinds of calculations. But as you are interested in PQ so here is what you need to do.
Sort your "dataday" and "expiry date" in ascending order.
Add an index column: Add Column > Index Column > From 1.
Create a new custom column with the following formula:
= List.Sum(Table.SelectRows(MyTable, each [Index] >= _[Index] and [expiry date] >= _[expiry date]) [volume])
Rename the new column as "Cumulative Volume".
Apply the changes and load the data into Power BI.
Please note this can be quite slow on larger data sets, as it's essentially a row-by-row operation. If you find that this method is too slow, you may need to consider a different tool or method, such as using SQL.
Proud to be a Super User!
@streli- Check this out, if this works for you.
Cumulative Amount =
CALCULATE(
SUM('Table'[Volumen]),
FILTER(
ALL('Table'),
'Table'[ExpiryDate] > MAX('Table'[DataDay])
)
)
Hi, @streli
Unfortunately, Power Query Editor doesn't support running totals directly. However, we can do it with Power Query Editor for cleaning and filtering the data, and then DAX for the cumulative sum calculation. Here's a step-by-step guide.
Load and clean data in Power Query Editor, making sure the dataday, expiry date, and volume columns are correctly formatted.
Filter rows in Power Query Editor where the expiry date is earlier than the dataday.
Close & Apply to load the filtered data into Power BI.
Create a new DAX measure in Power BI to calculate the cumulative sum:
Cumulative Volume =
CALCULATE (
SUM ('Table'[Volume]),
FILTER (
ALLSELECTED ('Table'[dataday]),
'Table'[dataday] <= MAX ('Table'[dataday])
)
)
This will give you a cumulative sum that excludes datadays with expiry dates earlier than the dataday.
Proud to be a Super User!