Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a $100 discount. Register Now

Nathaniel_C

Using EARLIER() with a SMART INDEX.

Many times, we need to find a value from a different row than the one we are on, in order to use that value for a formula. In Excel the engine knows that cell C1 is in the row above cell C2 and thus we can subtract cell C2 from C1. In Power BI the sort order for each column may be non-sequential to our way of thinking. One exception to that rule is that we think of a date column as being sorted from earliest date to the latest date. The Power BI engine does as well. 

 

But what happens if we have a date column that has many duplicate dates, or times, or no date column at all?

 

There are many articles written about the use of EARLIER() because a lot of users have a hard time with it. Rob Collie suggests that it is misnamed and should be called CURRENTROW(). One of the issues is that Power BI may store data in an other-than-expected pattern for the column we need to use, and so when using EARLIER we need a date column or an Index.  An Index column is another column that humans and Power BI think is sorted in the same manner. In Power Query an Index column is sequential, starting with 0 or 1 and incrementing each row.

 

One of the solutions in using EARLIER (), is to add an Index column in Power Query so that whatever row we are on, if we want the row above, we can write DAX to filter the table so that we can retrieve the value from the row where the index is one less, (meaning the row above.) So, if we are on row 256, and want the previous value, we ask for -1 which would give us the row 255. DAX gets the value from whatever column we indicated in the previous row, or the row two above if we use -2, etc.

 

So on to the problem.

 

Our DAX pilgrim posted:

 

“I have cumulative hourly meter readings (kWh) from 01/08/2018 00:00 to 31/08/2018 00:00. Each of meter serial is unique and belongs to a certain feeder. what I want to achieve is to evaluate (using DAX column or a measure) for each meter in each feeder the hourly (kwh) from the cumulative i.e the difference between the hourly time steps readings. Thanks in advance.”

 

Nicely explained with a table included.

 

In solving this issue, one might be tempted to use EARLIER with an Index Column. However, as it stands this will not work because the previous read for a meter may be many rows away in the table, and the date column has duplicate values. One might think that using a visual like a Matrix would do the filtering for us, but if we don’t write this in DAX, we are limited in where we might use this. One might think of doing the filtering and calculations in Excel and then uploading the results would be a work around, but then the ability to update automatically from the source is lost. So, let us do this in Power BI and Power Query using DAX.

 

 

After thinking on this issue and trying various approaches, I did three things:

  1. Entered a table that conformed with the poster’s logic
  2. Created a unique identifier for the table
  3. Created a Smart Index

 

Although the poster had written that each meter serial number was unique it was not clear to me whether that meant for each feeder or overall. Better safe than sorry, I merged the columns containing the Feeder identity, and the meter serial number to create a unique identifier that covered all Feeders and all Meters.

 

The next step is the crux of the solution to my mind. In my rock-climbing days, the crux was that one essential move that you had to use to make it to the top. And like many problems, looking back on the solution, it may seem simple, but while you are hanging on the rock face trying to solve it, it is never simple.

 

Many of us know in Power Query how to click on the little arrow at the top of the column and access the sort order.  Simple, yes? But what to sort and why? I sorted first on the unique identifier. That brought each meter next to itself. Then I sorted the Date Time column to put the earlier time first.

 

And then I inserted the index. Let me repeat that. And then I inserted the index.

 

Normally, as I mentioned above, we use an index to be able to move one (or more) rows up. Like a ladder. I am here and I want to go there. However, in this case by adding the index after the sorting, the index captured the sort order. So now the index contains some explicit intelligence that a regular index only carries implicitly. We have a Smart Index!  Which is a lot more powerful! This Smart Index knows how to “pair” up the two readings that we need, and equally important, which reading came first.

 

 

Following is the column code and the PBIX that I created. Smart Index

Hourly Difference = 
                    var firstread = CALCULATE ( MAX( Meter[Hourly Reading] ), FILTER( ALL( Meter), Meter[Smart Index] = EARLIER( Meter[Smart Index] ) -1 ))  //finds the previous reading using Smart Index, so essentially sorting and filtering here 
                    var currentread = Meter[Hourly Reading]  //finds the current reading 
                    var result = IF ( Meter[Smart Index]=0, 0, currentread - firstread ) // 3 possible values, 0, blank or the difference. Here either 0 or the difference, the next line the difference or blank


                
  return   IF( Meter[Unique ID] = CALCULATE( MAX( Meter[Unique ID] ), FILTER( ALL( Meter ), Meter[Smart Index] = EARLIER( Meter[Smart Index] )-1 )), result)

In the return we want to make sure that of the pair, that we are on second row of the pair. So, we insert a test.  And what is cool, is that we can use pretty much the same DAX, except instead of the value from [Hourly reading] we are looking for the value from [Unique ID]. What we want to know is, do both rows contain the same meter value? If the value from the current [Unique ID] is the same as the value from the previous [Unique ID] then go ahead and subtract the one from the other (result) otherwise do nothing which Power BI interprets as blank because we do not want to compare one meter's value to another meter's value.

 

Using a Smart Index that encapsulates the order with which we would like to see our data used with EARLIER() is a win-win situation.  It essentially reduces the need for filtering as well. Keep your eyes open for other situations that a Smart Index could be applicable.

Comments