cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
gdssiqueira Regular Visitor
Regular Visitor

Calculate difference from latest value

Hi guys,

 

this one is a variation of this post. Since I can't mark it as 'unsolved', it made sense to start a new topic.

 

In the previous post, I have an example, in which I wanted to subtract two values from different rows when they had a different type (there's a Type column) and referred to the same date. Now I'm going to present a different example because the goal is a little different.

 

Here's the simplified version of my dataset. The Selection column actually represents a few columns grouped into one.

The table is not necessarily ordered by latest date.

 

Date Type Value Selection

1/6    A      4           X

1/5    A      2           X

1/4    B      3           X

1/3    A      5           Y

1/3    B      2           Y

1/2    B      3           X

1/1    A     3           Y

1/1   B      1           Y

 

What I want is, for every row of type A, calculate the difference between its value and the value of the last row of type B, with a Date equal or prior to its Date and of the same Selection. I'll try to show the output I'm expecting:

Date Type Value Selection  Difference

1/6    A       4          X             4 - 3 =1 (from row 3)

1/5    A       2          X             2 - 3 = -1  (from row 3)

1/4    B       3          X             blank

1/3    A       5          Y             5 - 3 = 2 (from row 6, because row 5 has different Selection value)

1/3    B       2          X             blank

1/2    B       3          Y            blank

1/1    A       3         Y            3-1 = 2

1/1    B      1          Y            blank

 

How could I approach this? Aditionally, if there's another column with a true/false value saying if you should ignore that row, how would that change my approach? Feel free to message me with questions if something isn't clear enough.

 

Thanks ! 🙂

1 ACCEPTED SOLUTION

Accepted Solutions
Super User
Super User

Re: Calculate difference from latest value

This is one possibility:

 

let

// This function selects the corresponding B-Row for every A value
SelectB = (selection, date) =>

let
// several filters following
// omitt this first filter if you don't have the true/false-field #"Filtered Rows3" = Table.SelectRows(ChgType, each ([#"true/false"] = "true")), // without the true/false-field, the next filter will look like this:
// #"Filtered Rows" = Table.SelectRows(ChgType, each ([Selection] = selection)),
#"Filtered Rows" = Table.SelectRows(#"Filtered Rows3", each ([Selection] = selection)), #"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each [Date] <= date), #"Filtered Rows2" = Table.SelectRows(#"Filtered Rows1", each ([Type] = "B")), // Just keep the latest row from this selection
#"Sorted Rows" = Table.Buffer(Table.Sort(#"Filtered Rows2",{{"Date", Order.Descending}})), #"Kept First Rows" = Table.FirstN(#"Sorted Rows",1) in #"Kept First Rows", // Example data - replace with your table Source1 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUMzDTMzIwNFPSUVJwBBImIIZCBJAsKSpNVYrVgSgyRVFkhF2RCVyRE5Awxq7IGMUkU7CiSNyKnODWgRSlJeYUI1QZEWOfIYp9xtjtgysCGWSIpiYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, Type = _t, Value = _t, Selection = _t, #"true/false" = _t]), ChgType = Table.Buffer(Table.TransformColumnTypes(Source1,{{"Date", type date}, {"Type", type text}, {"Value", Int64.Type}, {"Selection", type text}})), // Here the function is called #"Added Custom" = Table.AddColumn(ChgType, "BValue", each SelectB([Selection], [Date])), #"Expanded BValue" = Table.ExpandTableColumn(#"Added Custom", "BValue", {"Value"}, {"ValueB"}), #"Added Custom1" = Table.AddColumn(#"Expanded BValue", "Difference", each if [Type]="A" then [Value]-[ValueB] else "") in #"Added Custom1"

If you liked my solution, please give it a thumbs up. An if I did answer your question, please mark my post as a solution. Thanks!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




View solution in original post

12 REPLIES 12
Super User
Super User

Re: Calculate difference from latest value

This is one possibility:

 

let

// This function selects the corresponding B-Row for every A value
SelectB = (selection, date) =>

let
// several filters following
// omitt this first filter if you don't have the true/false-field #"Filtered Rows3" = Table.SelectRows(ChgType, each ([#"true/false"] = "true")), // without the true/false-field, the next filter will look like this:
// #"Filtered Rows" = Table.SelectRows(ChgType, each ([Selection] = selection)),
#"Filtered Rows" = Table.SelectRows(#"Filtered Rows3", each ([Selection] = selection)), #"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each [Date] <= date), #"Filtered Rows2" = Table.SelectRows(#"Filtered Rows1", each ([Type] = "B")), // Just keep the latest row from this selection
#"Sorted Rows" = Table.Buffer(Table.Sort(#"Filtered Rows2",{{"Date", Order.Descending}})), #"Kept First Rows" = Table.FirstN(#"Sorted Rows",1) in #"Kept First Rows", // Example data - replace with your table Source1 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUMzDTMzIwNFPSUVJwBBImIIZCBJAsKSpNVYrVgSgyRVFkhF2RCVyRE5Awxq7IGMUkU7CiSNyKnODWgRSlJeYUI1QZEWOfIYp9xtjtgysCGWSIpiYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, Type = _t, Value = _t, Selection = _t, #"true/false" = _t]), ChgType = Table.Buffer(Table.TransformColumnTypes(Source1,{{"Date", type date}, {"Type", type text}, {"Value", Int64.Type}, {"Selection", type text}})), // Here the function is called #"Added Custom" = Table.AddColumn(ChgType, "BValue", each SelectB([Selection], [Date])), #"Expanded BValue" = Table.ExpandTableColumn(#"Added Custom", "BValue", {"Value"}, {"ValueB"}), #"Added Custom1" = Table.AddColumn(#"Expanded BValue", "Difference", each if [Type]="A" then [Value]-[ValueB] else "") in #"Added Custom1"

If you liked my solution, please give it a thumbs up. An if I did answer your question, please mark my post as a solution. Thanks!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




View solution in original post

gdssiqueira Regular Visitor
Regular Visitor

Re: Calculate difference from latest value

Thanks for taking the time and replying. I'm going to give it a try on Monday and come back with the results. I really appreciate your help!

gdssiqueira Regular Visitor
Regular Visitor

Re: Calculate difference from latest value

@ImkeF

 

I believe I managed to adapt this to my code, as per your instructions. I've added a few more filters, since - as I said - selection is not an actual field, but several fields that I had to filter from.

 

I'm waiting for it to finish, but it seems like it will take several hours to complete, which is worrisome. It's processed 254 rows in 3 minutes, out of 260.000 in my dataset. Here's what my code looks like - I've changed the names to hide sensitive info, but the structure is the same.

let

// This function selects the corresponding B-Row for every A value
SelectB = (d, e, f, g, date) =>

let
// several filters following
// this is actually the true/false column. It's a custom column that is true if [Date] = [T] #"Filtered Rows2" = Table.SelectRows(Source1, each ([Date] = [T])), #"Filtered Rows4" = Table.SelectRows(#"Filtered Rows2", each ([D] = d)), #"Filtered Rows5" = Table.SelectRows(#"Filtered Rows4", each ([E] = e)), #"Filtered Rows6" = Table.SelectRows(#"Filtered Rows5", each ([F] = f)), #"Filtered Rows" = Table.SelectRows(#"Filtered Rows3", each ([G] = g)), #"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each [Date] <= date), #"Filtered Rows2" = Table.SelectRows(#"Filtered Rows1", each ([Type] = "B")), // Just keep the latest row from this selection #"Sorted Rows" = Table.Buffer(Table.Sort(#"Filtered Rows6",{{"Date", Order.Descending}})), #"Kept First Rows" = Table.FirstN(#"Sorted Rows",1) in #"Kept First Rows", // Example data - replace with your table Source1 = MyTable, // Here the function is called #"Added Custom" = Table.AddColumn(Source1, "BValue", each SelectB([D], [E], [F], [G], [Date])), #"Expanded BValue" = Table.ExpandTableColumn(#"Added Custom", "BValue", {"Value"}, {"ValueB"}), #"Added Custom1" = Table.AddColumn(#"Expanded BValue", "Difference", each if [Type]="A" then [Value]-[ValueB] else "") in #"Added Custom1"

 I understand as it does a lot of filtering for each of the rows, it's expected to be slow, but I think I'll have to think of something else if this has to run a few hours every day (I have daily refresh on my dataset)

Super User
Super User

Re: Calculate difference from latest value

Sorry, always forget to place a Buffer in order to prevent multiple calls.

Try to change your Source1-step to:

 

Source1 = Table.Buffer(MyTable),

 

 

If you liked my solution, please give it a thumbs up. An if I did answer your question, please mark my post as a solution. Thanks!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




gdssiqueira Regular Visitor
Regular Visitor

Re: Calculate difference from latest value

@ImkeF

 

Well, that does help! But unfortunately, it's still slow. The 'preview' has been running for a few minutes and nothing. I haven't been able to get a sense on how long it's actually going to take. PowerBI doesn't give you a whole lot of feedback on what it's doing...

 

Anyway, I'm thinking of creating an indexed view on SQL Server that does the calculations and then just read from the view on PowerBI. Does that sound reasonable? Would that query be ok to do in SQL?

 

Thanks 🙂

Super User
Super User

Re: Calculate difference from latest value

Sorry, I'm obviously a little bit slow on this one. There should be a substantial improvement in performance, if we "partition" the table before calling the function:

 

// Example data - replace with your table     
    Source1 = MyTable,
// Here the function is called     
  #"Grouped Rows" = Table.Group(Source1, {"D", "E", "F", "G"}, {{"Data", each Table.AddColumn(_, "BValue", each SelectB([Selection], [Date])), type table}}),
    #"Expanded Data" = Table.ExpandTableColumn(#"Grouped Rows", "Data", {"BValue", "Date", "true/false", "Type", "Value"}, {"BValue", "Date", "true/false", "Type", "Value"}),
    #"Expanded BValue" = Table.ExpandTableColumn(#"Expanded Data", "BValue", {"Value"}, {"ValueB"}),
    #"Added Custom1" = Table.AddColumn(#"Expanded BValue", "Difference", each if [Type]="A" then [Value]-[ValueB] else "") 
in     
    #"Added Custom1"

This is done by the Grouped Rows-step.

 

 

However, the view at the server might be the best idea.

 

Having a look at your filters in the function itself I was wondering about the true/false-filter, as it seems to contradict the logic you've described: As it actually reduced the returned rows to 1 date, there will be no dates before, hence no need to filter for the latest value and so on.

If you liked my solution, please give it a thumbs up. An if I did answer your question, please mark my post as a solution. Thanks!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




gdssiqueira Regular Visitor
Regular Visitor

Re: Calculate difference from latest value

@ImkeF

Nice catch on that date part, that's obviously wrong. There's a timestamp field and a field with the latest timestamp for that specific category - that's because some records are updated with different values, but I want to keep the historical data, so they are added and can be told apart with the timestamp. Now, I only want to use the records with the latest timestamp amongst their category (category/selection is a set of columns). The way I handled that was to write the max timestamp for its category (using group by) in each record as an extra column. If timestamp = latest timestamp (this is the true/false logic), this is the latest record for that category, thus the one we want to use.

To sum all of this up, the true/false line doesn't use Date but Timestamp. 

 

I'm going to try your modifications in a few minutes, before trying the view alternative.

Microsoft arify
Microsoft

Re: Calculate difference from latest value

What's your datasource? There might be a smart native query you can write to make it faster. (Like using LAG function for SQL Server https://msdn.microsoft.com/en-us/library/hh231256.aspx )

gdssiqueira Regular Visitor
Regular Visitor

Re: Calculate difference from latest value

@arify

 

Thanks for joining 🙂

My datasource is SQL Azure (which is basically SQL Server).

Helpful resources

Announcements
New Ranks and Rank Icons in 2020

New Ranks and Rank Icons in 2020

Read the announcement for more information!

New Kudos Given Badges Coming

New Kudos Given Badges Coming

We're rolling out new Kudos Given badges. Find out how many Kudos you've given.

November 2019 Community Highlights

November 2019 Community Highlights

Get an overview of the events and great community content from November.

Power Platform World Tour

Power Platform World Tour

Find out where you can attend!

Top Solution Authors
Top Kudoed Authors (Last 30 Days)