Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
gdssiqueira
Helper I
Helper I

Calculate difference from latest value

Hi guys,

 

this one is a variation of this post. Since I can't mark it as 'unsolved', it made sense to start a new topic.

 

In the previous post, I have an example, in which I wanted to subtract two values from different rows when they had a different type (there's a Type column) and referred to the same date. Now I'm going to present a different example because the goal is a little different.

 

Here's the simplified version of my dataset. The Selection column actually represents a few columns grouped into one.

The table is not necessarily ordered by latest date.

 

Date Type Value Selection

1/6    A      4           X

1/5    A      2           X

1/4    B      3           X

1/3    A      5           Y

1/3    B      2           Y

1/2    B      3           X

1/1    A     3           Y

1/1   B      1           Y

 

What I want is, for every row of type A, calculate the difference between its value and the value of the last row of type B, with a Date equal or prior to its Date and of the same Selection. I'll try to show the output I'm expecting:

Date Type Value Selection  Difference

1/6    A       4          X             4 - 3 =1 (from row 3)

1/5    A       2          X             2 - 3 = -1  (from row 3)

1/4    B       3          X             blank

1/3    A       5          Y             5 - 3 = 2 (from row 6, because row 5 has different Selection value)

1/3    B       2          X             blank

1/2    B       3          Y            blank

1/1    A       3         Y            3-1 = 2

1/1    B      1          Y            blank

 

How could I approach this? Aditionally, if there's another column with a true/false value saying if you should ignore that row, how would that change my approach? Feel free to message me with questions if something isn't clear enough.

 

Thanks ! 🙂

1 ACCEPTED SOLUTION
ImkeF
Super User
Super User

This is one possibility:

 

let

// This function selects the corresponding B-Row for every A value
SelectB = (selection, date) =>

let
// several filters following
// omitt this first filter if you don't have the true/false-field #"Filtered Rows3" = Table.SelectRows(ChgType, each ([#"true/false"] = "true")), // without the true/false-field, the next filter will look like this:
// #"Filtered Rows" = Table.SelectRows(ChgType, each ([Selection] = selection)),
#"Filtered Rows" = Table.SelectRows(#"Filtered Rows3", each ([Selection] = selection)), #"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each [Date] <= date), #"Filtered Rows2" = Table.SelectRows(#"Filtered Rows1", each ([Type] = "B")), // Just keep the latest row from this selection
#"Sorted Rows" = Table.Buffer(Table.Sort(#"Filtered Rows2",{{"Date", Order.Descending}})), #"Kept First Rows" = Table.FirstN(#"Sorted Rows",1) in #"Kept First Rows", // Example data - replace with your table Source1 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUMzDTMzIwNFPSUVJwBBImIIZCBJAsKSpNVYrVgSgyRVFkhF2RCVyRE5Awxq7IGMUkU7CiSNyKnODWgRSlJeYUI1QZEWOfIYp9xtjtgysCGWSIpiYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, Type = _t, Value = _t, Selection = _t, #"true/false" = _t]), ChgType = Table.Buffer(Table.TransformColumnTypes(Source1,{{"Date", type date}, {"Type", type text}, {"Value", Int64.Type}, {"Selection", type text}})), // Here the function is called #"Added Custom" = Table.AddColumn(ChgType, "BValue", each SelectB([Selection], [Date])), #"Expanded BValue" = Table.ExpandTableColumn(#"Added Custom", "BValue", {"Value"}, {"ValueB"}), #"Added Custom1" = Table.AddColumn(#"Expanded BValue", "Difference", each if [Type]="A" then [Value]-[ValueB] else "") in #"Added Custom1"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

12 REPLIES 12
ImkeF
Super User
Super User

This is one possibility:

 

let

// This function selects the corresponding B-Row for every A value
SelectB = (selection, date) =>

let
// several filters following
// omitt this first filter if you don't have the true/false-field #"Filtered Rows3" = Table.SelectRows(ChgType, each ([#"true/false"] = "true")), // without the true/false-field, the next filter will look like this:
// #"Filtered Rows" = Table.SelectRows(ChgType, each ([Selection] = selection)),
#"Filtered Rows" = Table.SelectRows(#"Filtered Rows3", each ([Selection] = selection)), #"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each [Date] <= date), #"Filtered Rows2" = Table.SelectRows(#"Filtered Rows1", each ([Type] = "B")), // Just keep the latest row from this selection
#"Sorted Rows" = Table.Buffer(Table.Sort(#"Filtered Rows2",{{"Date", Order.Descending}})), #"Kept First Rows" = Table.FirstN(#"Sorted Rows",1) in #"Kept First Rows", // Example data - replace with your table Source1 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUMzDTMzIwNFPSUVJwBBImIIZCBJAsKSpNVYrVgSgyRVFkhF2RCVyRE5Awxq7IGMUkU7CiSNyKnODWgRSlJeYUI1QZEWOfIYp9xtjtgysCGWSIpiYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, Type = _t, Value = _t, Selection = _t, #"true/false" = _t]), ChgType = Table.Buffer(Table.TransformColumnTypes(Source1,{{"Date", type date}, {"Type", type text}, {"Value", Int64.Type}, {"Selection", type text}})), // Here the function is called #"Added Custom" = Table.AddColumn(ChgType, "BValue", each SelectB([Selection], [Date])), #"Expanded BValue" = Table.ExpandTableColumn(#"Added Custom", "BValue", {"Value"}, {"ValueB"}), #"Added Custom1" = Table.AddColumn(#"Expanded BValue", "Difference", each if [Type]="A" then [Value]-[ValueB] else "") in #"Added Custom1"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

@ImkeF

 

I believe I managed to adapt this to my code, as per your instructions. I've added a few more filters, since - as I said - selection is not an actual field, but several fields that I had to filter from.

 

I'm waiting for it to finish, but it seems like it will take several hours to complete, which is worrisome. It's processed 254 rows in 3 minutes, out of 260.000 in my dataset. Here's what my code looks like - I've changed the names to hide sensitive info, but the structure is the same.

let

// This function selects the corresponding B-Row for every A value
SelectB = (d, e, f, g, date) =>

let
// several filters following
// this is actually the true/false column. It's a custom column that is true if [Date] = [T] #"Filtered Rows2" = Table.SelectRows(Source1, each ([Date] = [T])), #"Filtered Rows4" = Table.SelectRows(#"Filtered Rows2", each ([D] = d)), #"Filtered Rows5" = Table.SelectRows(#"Filtered Rows4", each ([E] = e)), #"Filtered Rows6" = Table.SelectRows(#"Filtered Rows5", each ([F] = f)), #"Filtered Rows" = Table.SelectRows(#"Filtered Rows3", each ([G] = g)), #"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each [Date] <= date), #"Filtered Rows2" = Table.SelectRows(#"Filtered Rows1", each ([Type] = "B")), // Just keep the latest row from this selection #"Sorted Rows" = Table.Buffer(Table.Sort(#"Filtered Rows6",{{"Date", Order.Descending}})), #"Kept First Rows" = Table.FirstN(#"Sorted Rows",1) in #"Kept First Rows", // Example data - replace with your table Source1 = MyTable, // Here the function is called #"Added Custom" = Table.AddColumn(Source1, "BValue", each SelectB([D], [E], [F], [G], [Date])), #"Expanded BValue" = Table.ExpandTableColumn(#"Added Custom", "BValue", {"Value"}, {"ValueB"}), #"Added Custom1" = Table.AddColumn(#"Expanded BValue", "Difference", each if [Type]="A" then [Value]-[ValueB] else "") in #"Added Custom1"

 I understand as it does a lot of filtering for each of the rows, it's expected to be slow, but I think I'll have to think of something else if this has to run a few hours every day (I have daily refresh on my dataset)

Sorry, always forget to place a Buffer in order to prevent multiple calls.

Try to change your Source1-step to:

 

Source1 = Table.Buffer(MyTable),

 

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

@ImkeF

 

Well, that does help! But unfortunately, it's still slow. The 'preview' has been running for a few minutes and nothing. I haven't been able to get a sense on how long it's actually going to take. PowerBI doesn't give you a whole lot of feedback on what it's doing...

 

Anyway, I'm thinking of creating an indexed view on SQL Server that does the calculations and then just read from the view on PowerBI. Does that sound reasonable? Would that query be ok to do in SQL?

 

Thanks 🙂

Sorry, I'm obviously a little bit slow on this one. There should be a substantial improvement in performance, if we "partition" the table before calling the function:

 

// Example data - replace with your table     
    Source1 = MyTable,
// Here the function is called     
  #"Grouped Rows" = Table.Group(Source1, {"D", "E", "F", "G"}, {{"Data", each Table.AddColumn(_, "BValue", each SelectB([Selection], [Date])), type table}}),
    #"Expanded Data" = Table.ExpandTableColumn(#"Grouped Rows", "Data", {"BValue", "Date", "true/false", "Type", "Value"}, {"BValue", "Date", "true/false", "Type", "Value"}),
    #"Expanded BValue" = Table.ExpandTableColumn(#"Expanded Data", "BValue", {"Value"}, {"ValueB"}),
    #"Added Custom1" = Table.AddColumn(#"Expanded BValue", "Difference", each if [Type]="A" then [Value]-[ValueB] else "") 
in     
    #"Added Custom1"

This is done by the Grouped Rows-step.

 

 

However, the view at the server might be the best idea.

 

Having a look at your filters in the function itself I was wondering about the true/false-filter, as it seems to contradict the logic you've described: As it actually reduced the returned rows to 1 date, there will be no dates before, hence no need to filter for the latest value and so on.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

@ImkeF

Nice catch on that date part, that's obviously wrong. There's a timestamp field and a field with the latest timestamp for that specific category - that's because some records are updated with different values, but I want to keep the historical data, so they are added and can be told apart with the timestamp. Now, I only want to use the records with the latest timestamp amongst their category (category/selection is a set of columns). The way I handled that was to write the max timestamp for its category (using group by) in each record as an extra column. If timestamp = latest timestamp (this is the true/false logic), this is the latest record for that category, thus the one we want to use.

To sum all of this up, the true/false line doesn't use Date but Timestamp. 

 

I'm going to try your modifications in a few minutes, before trying the view alternative.

What's your datasource? There might be a smart native query you can write to make it faster. (Like using LAG function for SQL Server https://msdn.microsoft.com/en-us/library/hh231256.aspx )

@arify

 

Thanks for joining 🙂

My datasource is SQL Azure (which is basically SQL Server).

Sorry, I didn't fully read your question. LAG wouldn't help, this looks pretty complicated.

@arify

 

Yeah... but knowing it will be helpful for other queries I've thought about! Thank you (:

Glad I could help somehow 🙂

Thanks for taking the time and replying. I'm going to give it a try on Monday and come back with the results. I really appreciate your help!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.