cancel
Showing results for
Did you mean:
Highlighted
gdssiqueira Regular Visitor

## Calculate difference from latest value

Hi guys,

this one is a variation of this post. Since I can't mark it as 'unsolved', it made sense to start a new topic.

In the previous post, I have an example, in which I wanted to subtract two values from different rows when they had a different type (there's a Type column) and referred to the same date. Now I'm going to present a different example because the goal is a little different.

Here's the simplified version of my dataset. The Selection column actually represents a few columns grouped into one.

The table is not necessarily ordered by latest date.

Date Type Value Selection

1/6    A      4           X

1/5    A      2           X

1/4    B      3           X

1/3    A      5           Y

1/3    B      2           Y

1/2    B      3           X

1/1    A     3           Y

1/1   B      1           Y

What I want is, for every row of type A, calculate the difference between its value and the value of the last row of type B, with a Date equal or prior to its Date and of the same Selection. I'll try to show the output I'm expecting:

Date Type Value Selection  Difference

1/6    A       4          X             4 - 3 =1 (from row 3)

1/5    A       2          X             2 - 3 = -1  (from row 3)

1/4    B       3          X             blank

1/3    A       5          Y             5 - 3 = 2 (from row 6, because row 5 has different Selection value)

1/3    B       2          X             blank

1/2    B       3          Y            blank

1/1    A       3         Y            3-1 = 2

1/1    B      1          Y            blank

How could I approach this? Aditionally, if there's another column with a true/false value saying if you should ignore that row, how would that change my approach? Feel free to message me with questions if something isn't clear enough.

Thanks ! 🙂

1 ACCEPTED SOLUTION

Accepted Solutions Super User

## Re: Calculate difference from latest value

This is one possibility:

```let

// This function selects the corresponding B-Row for every A value
SelectB = (selection, date) =>

let
// several filters following// omitt this first filter if you don't have the true/false-field
#"Filtered Rows3" = Table.SelectRows(ChgType, each ([#"true/false"] = "true")),
// without the true/false-field, the next filter will look like this:// #"Filtered Rows" = Table.SelectRows(ChgType, each ([Selection] = selection)),    #"Filtered Rows" = Table.SelectRows(#"Filtered Rows3", each ([Selection] = selection)),
#"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each [Date] <= date),
#"Filtered Rows2" = Table.SelectRows(#"Filtered Rows1", each ([Type] = "B")),
// Just keep the latest row from this selection     #"Sorted Rows" = Table.Buffer(Table.Sort(#"Filtered Rows2",{{"Date", Order.Descending}})),
#"Kept First Rows" = Table.FirstN(#"Sorted Rows",1)
in
#"Kept First Rows",

// Example data - replace with your table
Source1 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUMzDTMzIwNFPSUVJwBBImIIZCBJAsKSpNVYrVgSgyRVFkhF2RCVyRE5Awxq7IGMUkU7CiSNyKnODWgRSlJeYUI1QZEWOfIYp9xtjtgysCGWSIpiYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, Type = _t, Value = _t, Selection = _t, #"true/false" = _t]),
ChgType = Table.Buffer(Table.TransformColumnTypes(Source1,{{"Date", type date}, {"Type", type text}, {"Value", Int64.Type}, {"Selection", type text}})),
// Here the function is called
#"Expanded BValue" = Table.ExpandTableColumn(#"Added Custom", "BValue", {"Value"}, {"ValueB"}),
#"Added Custom1" = Table.AddColumn(#"Expanded BValue", "Difference", each if [Type]="A" then [Value]-[ValueB] else "")
in

If you liked my solution, please give it a thumbs up. An if I did answer your question, please mark my post as a solution. Thanks!

Proud to be a Datanaut!

Imke Feldmann

12 REPLIES 12 Super User

## Re: Calculate difference from latest value

This is one possibility:

```let

// This function selects the corresponding B-Row for every A value
SelectB = (selection, date) =>

let
// several filters following// omitt this first filter if you don't have the true/false-field
#"Filtered Rows3" = Table.SelectRows(ChgType, each ([#"true/false"] = "true")),
// without the true/false-field, the next filter will look like this:// #"Filtered Rows" = Table.SelectRows(ChgType, each ([Selection] = selection)),    #"Filtered Rows" = Table.SelectRows(#"Filtered Rows3", each ([Selection] = selection)),
#"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each [Date] <= date),
#"Filtered Rows2" = Table.SelectRows(#"Filtered Rows1", each ([Type] = "B")),
// Just keep the latest row from this selection     #"Sorted Rows" = Table.Buffer(Table.Sort(#"Filtered Rows2",{{"Date", Order.Descending}})),
#"Kept First Rows" = Table.FirstN(#"Sorted Rows",1)
in
#"Kept First Rows",

// Example data - replace with your table
Source1 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUMzDTMzIwNFPSUVJwBBImIIZCBJAsKSpNVYrVgSgyRVFkhF2RCVyRE5Awxq7IGMUkU7CiSNyKnODWgRSlJeYUI1QZEWOfIYp9xtjtgysCGWSIpiYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, Type = _t, Value = _t, Selection = _t, #"true/false" = _t]),
ChgType = Table.Buffer(Table.TransformColumnTypes(Source1,{{"Date", type date}, {"Type", type text}, {"Value", Int64.Type}, {"Selection", type text}})),
// Here the function is called
#"Expanded BValue" = Table.ExpandTableColumn(#"Added Custom", "BValue", {"Value"}, {"ValueB"}),
#"Added Custom1" = Table.AddColumn(#"Expanded BValue", "Difference", each if [Type]="A" then [Value]-[ValueB] else "")
in

If you liked my solution, please give it a thumbs up. An if I did answer your question, please mark my post as a solution. Thanks!

Proud to be a Datanaut!

Imke Feldmann

gdssiqueira Regular Visitor

## Re: Calculate difference from latest value

Thanks for taking the time and replying. I'm going to give it a try on Monday and come back with the results. I really appreciate your help!

gdssiqueira Regular Visitor

## Re: Calculate difference from latest value

@ImkeF

I believe I managed to adapt this to my code, as per your instructions. I've added a few more filters, since - as I said - selection is not an actual field, but several fields that I had to filter from.

I'm waiting for it to finish, but it seems like it will take several hours to complete, which is worrisome. It's processed 254 rows in 3 minutes, out of 260.000 in my dataset. Here's what my code looks like - I've changed the names to hide sensitive info, but the structure is the same.

```let

// This function selects the corresponding B-Row for every A value
SelectB = (d, e, f, g, date) =>

let
// several filters following// this is actually the true/false column. It's a custom column that is true if [Date] = [T]
#"Filtered Rows2" = Table.SelectRows(Source1, each ([Date] = [T])),
#"Filtered Rows4" = Table.SelectRows(#"Filtered Rows2", each ([D] = d)),
#"Filtered Rows5" = Table.SelectRows(#"Filtered Rows4", each ([E] = e)),
#"Filtered Rows6" = Table.SelectRows(#"Filtered Rows5", each ([F] = f)),
#"Filtered Rows" = Table.SelectRows(#"Filtered Rows3", each ([G] = g)),
#"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each [Date] <= date),
#"Filtered Rows2" = Table.SelectRows(#"Filtered Rows1", each ([Type] = "B")),
// Just keep the latest row from this selection
#"Sorted Rows" = Table.Buffer(Table.Sort(#"Filtered Rows6",{{"Date", Order.Descending}})),
#"Kept First Rows" = Table.FirstN(#"Sorted Rows",1)
in
#"Kept First Rows",

// Example data - replace with your table
Source1 = MyTable,
// Here the function is called
#"Expanded BValue" = Table.ExpandTableColumn(#"Added Custom", "BValue", {"Value"}, {"ValueB"}),
#"Added Custom1" = Table.AddColumn(#"Expanded BValue", "Difference", each if [Type]="A" then [Value]-[ValueB] else "")
in

I understand as it does a lot of filtering for each of the rows, it's expected to be slow, but I think I'll have to think of something else if this has to run a few hours every day (I have daily refresh on my dataset) Super User

## Re: Calculate difference from latest value

Sorry, always forget to place a Buffer in order to prevent multiple calls.

Try to change your Source1-step to:

Source1 = Table.Buffer(MyTable),

If you liked my solution, please give it a thumbs up. An if I did answer your question, please mark my post as a solution. Thanks!

Proud to be a Datanaut!

Imke Feldmann

gdssiqueira Regular Visitor

## Re: Calculate difference from latest value

@ImkeF

Well, that does help! But unfortunately, it's still slow. The 'preview' has been running for a few minutes and nothing. I haven't been able to get a sense on how long it's actually going to take. PowerBI doesn't give you a whole lot of feedback on what it's doing...

Anyway, I'm thinking of creating an indexed view on SQL Server that does the calculations and then just read from the view on PowerBI. Does that sound reasonable? Would that query be ok to do in SQL?

Thanks 🙂 Super User

## Re: Calculate difference from latest value

Sorry, I'm obviously a little bit slow on this one. There should be a substantial improvement in performance, if we "partition" the table before calling the function:

```// Example data - replace with your table
Source1 = MyTable,
// Here the function is called
#"Grouped Rows" = Table.Group(Source1, {"D", "E", "F", "G"}, {{"Data", each Table.AddColumn(_, "BValue", each SelectB([Selection], [Date])), type table}}),
#"Expanded Data" = Table.ExpandTableColumn(#"Grouped Rows", "Data", {"BValue", "Date", "true/false", "Type", "Value"}, {"BValue", "Date", "true/false", "Type", "Value"}),
#"Expanded BValue" = Table.ExpandTableColumn(#"Expanded Data", "BValue", {"Value"}, {"ValueB"}),
#"Added Custom1" = Table.AddColumn(#"Expanded BValue", "Difference", each if [Type]="A" then [Value]-[ValueB] else "")
in

This is done by the Grouped Rows-step.

However, the view at the server might be the best idea.

Having a look at your filters in the function itself I was wondering about the true/false-filter, as it seems to contradict the logic you've described: As it actually reduced the returned rows to 1 date, there will be no dates before, hence no need to filter for the latest value and so on.

If you liked my solution, please give it a thumbs up. An if I did answer your question, please mark my post as a solution. Thanks!

Proud to be a Datanaut!

Imke Feldmann

gdssiqueira Regular Visitor

## Re: Calculate difference from latest value

@ImkeF

Nice catch on that date part, that's obviously wrong. There's a timestamp field and a field with the latest timestamp for that specific category - that's because some records are updated with different values, but I want to keep the historical data, so they are added and can be told apart with the timestamp. Now, I only want to use the records with the latest timestamp amongst their category (category/selection is a set of columns). The way I handled that was to write the max timestamp for its category (using group by) in each record as an extra column. If timestamp = latest timestamp (this is the true/false logic), this is the latest record for that category, thus the one we want to use.

To sum all of this up, the true/false line doesn't use Date but Timestamp.

I'm going to try your modifications in a few minutes, before trying the view alternative. arify
Microsoft

## Re: Calculate difference from latest value

What's your datasource? There might be a smart native query you can write to make it faster. (Like using LAG function for SQL Server https://msdn.microsoft.com/en-us/library/hh231256.aspx )

gdssiqueira Regular Visitor

## Re: Calculate difference from latest value

@arify

Thanks for joining 🙂

My datasource is SQL Azure (which is basically SQL Server). 