Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Greg_Deckler

Linear Interpolation with Power BI

Introduction

If you are a mechanical engineer or mechanical engineering student, then you can skip past this introduction and the background information and just get right to the solution. I know this from my own experience in earning my Mechanical Engineering degree at Ohio Northern University. I had this professor, a Dr. Jed Marquart, for Thermodynamics and Fluid Mechanics. Nice enough guy, but looking back at it, probably a bit of a sadist. Homework, tests and final exams seemed like 75% of the time spent was on doing linear interpolation calculations. Misery...pure misery.

 

So, since I've been on this kick of demonstrating how to do some numerical methods and other math stuff in Power BI, I recalled the dozens, if not tens of thousands, of hours I spent as an undergraduate painstakingly doing linear interpolation to solve Thermodynamics and Fluid Mechanics problems. And I figured that perhaps, if I could save just one mechanical engineering student just a few tens of thousands of hours of linear interpolation time that it would be worth writing a blog article on how to do linear interpolation with Power BI. Besides, while we generally deal with known data in things like Power BI, having the ability to do interpolation could come in pretty handy for certain situations as well as "What If" analysis. So, here is the technique, but first a little background.

 

Background

Interpolation falls into the mathematical field of numerical analysis. Essentially, interpolation is a method of creating new data points within the range of a set of known data points. So, for example, suppose we have a set of data points like the following for some unknown function f(x) (reference the Wikipedia page here😞

 

x f(x)

0 0

1 0 . 8415

2 0 . 9093

3 0 . 1411

4 −0 . 7568

5 −0 . 9589

6 −0 . 2794

 

But, what we are interested in is what the value of f(x) is for a value of x=2.5. How can we accomplish this? Well, we can assume that the values between our known values of f(x) approximately fall along a straight line, like so:

thermo3.png

 

Using the two-point equation of a line, we get: y - y1 = m (x - x1), where m is the slope of the line. The slope of a line can be calculated with the formula m = (y2 - y1)/(x2 - x1). Thus, for x3 that is between x1 and x2, we get the final formula:

 

y3 = y1 + (x3 - x1) (y2 - y1)/(x2 - x1)

 

Plugging in our values we get:

 

y3 = .9093 + (2.5 - 2)(.1411-.9093)/(3-2)

y3 = .5252

 

So, you might be able to see how this could become a tad annoying if you had to do this 57 times in to solve a single Thermodynamics problem...

 

The Solution

 

Step 1: Get the Data

So, as a practical example of using interpolation, I chose to go grab some Thermodynamic steam tables from this site here. Specifically, this table. Now, to understand this data, these are tables of the properties of water at specific temperatures. These kinds of property tables are very common in Thermodynamics and Fluid Mechanics and come in both chart (diagram) and table form. For example, a Pressure-Enthalpy (P-h) diagram look like:

 

thermo4.png.gif

Nifty. But the problem is that not a single one of Dr. Marquart's problems EVER involved water at a temperature that was actually in these steam tables. Never. Not a single time. But I'm not bitter...

 

Deep breath. OK, so anyway, the data takes a little clean up, here is my Power Query code for cleaning up and importing the data. You will need to change the path in bold below for your own situation.

 

let

   Source = Excel.Workbook(File.Contents("C:\temp\powerbi\Thermo\H2O_TempSat.xls"), null, true),

   Sheet2 = Source{[Name="Sheet1"]}[Data],

   #"Changed Type" = Table.TransformColumnTypes(Sheet2,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type text}, {"Column9", type text}, {"Column10", type text}, {"Column11", type text}, {"Column12", type text}, {"Column13", type text}}),

   #"Removed Bottom Rows" = Table.RemoveLastN(#"Changed Type",2),

   #"Removed Top Rows" = Table.Skip(#"Removed Bottom Rows",1),

   #"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),

   #"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{" Temp", type text}, {"  Pressure", type text}, {"  volume (m^3/kg)", type text}, {"Column4", type text}, {"   energy (kJ/kg)", type text}, {"Column6", type text}, {"         enthalpy (kJ/kg)", type text}, {"Column8", type text}, {"Column9", type text}, {"   entropy (kJ/kg.K)", type text}, {"Column11", type text}, {"Column12", type text}, {"Column13", type text}}),

   #"Renamed Columns" = Table.RenameColumns(#"Changed Type1",{{" Temp", "Temp (C)"}, {"  Pressure", "Pressure (MPa)"}, {"  volume (m^3/kg)", "volume (m^3/kg) (vf)"}, {"Column4", "volume (m^3/kg) (vg)"}, {"   energy (kJ/kg)", "energy (kJ/kg) (uf)"}, {"Column6", "energy (kJ/kg) (ug)"}, {"         enthalpy (kJ/kg)", "enthalpy (kJ/kg) (hf)"}, {"Column8", "enthalpy (kJ/kg) (hfg)"}, {"Column9", "enthalpy (kJ/kg) (hg)"}, {"   entropy (kJ/kg.K)", "entropy (kJ/kg.K) (sf)"}, {"Column11", "entropy (kJ/kg.K) (sfg)"}, {"Column12", "entropy (kJ/kg.K) (sg)"}}),

   #"Removed Columns" = Table.RemoveColumns(#"Renamed Columns",{"Column13"}),

   #"Removed Top Rows1" = Table.Skip(#"Removed Columns",1),

   #"Changed Type2" = Table.TransformColumnTypes(#"Removed Top Rows1",{{"Temp (C)", type number}, {"Pressure (MPa)", type number}, {"volume (m^3/kg) (vf)", type number}, {"volume (m^3/kg) (vg)", type number}, {"energy (kJ/kg) (uf)", type number}, {"energy (kJ/kg) (ug)", type number}, {"enthalpy (kJ/kg) (hf)", type number}, {"enthalpy (kJ/kg) (hfg)", type number}, {"enthalpy (kJ/kg) (hg)", type number}, {"entropy (kJ/kg.K) (sf)", type number}, {"entropy (kJ/kg.K) (sfg)", type number}, {"entropy (kJ/kg.K) (sg)", type number}})

in

   #"Changed Type2"

 

Step 2: Create Interpolation Table

OK, so what you should have loaded into a table now are the properties of water at temperatures in increments of 5 degrees Celsius from about 0 degrees to about 373 degrees. I named my table H2OTempSat.

 

Next, we want to create a table for our Interpolated values. You can do this by going to the Modeling tab of the ribbon and choosing New Parameter from the What If section or just click on New Table and use the following formula:

 

Interpolation = GENERATESERIES(1,373,1)


You should now have an Interpolation table with numbers from 1 to 373 in increments of 1. Rename the single column in the table to Temp (C).

 

Step 3: Create Interpolation Measure

Note that there is no reason to relate these tables to one another. In fact, I would advise against it, although I have specifically engineered the formula to allow for active relationships between your interpolation table and the lookup table. In any event, create the following measusre:

 

Pressure (MPa) =

VAR x3 = MAX(Interpolation[Temp (C)])

VAR match = CALCULATE(MAX(H2OTempSat[Pressure (MPa)]),FILTER(ALL(H2OTempSat),[Temp (C)]=x3))

VAR x1 = CALCULATE(MAX(H2OTempSat[Temp (C)]),FILTER(ALL(H2OTempSat),[Temp (C)]<=x3))

VAR x2 = CALCULATE(MIN(H2OTempSat[Temp (C)]),FILTER(ALL(H2OTempSat),[Temp (C)]>=x3))

VAR y1 = CALCULATE(MAX(H2OTempSat[Pressure (MPa)]),FILTER(ALL(H2OTempSat),[Temp (C)]<=x3))

VAR y2 = CALCULATE(MIN(H2OTempSat[Pressure (MPa)]),FILTER(ALL(H2OTempSat),[Temp (C)]>=x3))

RETURN IF(NOT(ISBLANK(match)),match,y1 + (x3 - x1) * (y2 - y1)/(x2 - x1))

 

So, what this measure is doing is calculating the Pressure (MPa) from our steam table by interpolating between the known values of pressure at the specific temperatures listed in our steam table. Line by line:

  • The first line is just grabbing the value of the Temperature from our Interpolation table for which we want to interpolate a value for Pressure.
  • The second line is looking to see if there is a match in the lookup table for the value that we have for our temperature. In other words, we will not have to interpolate a value, we already know it even though this would never happen in an Dr. Marquart classroom.
  • The third line grabs the temperature in the steam table that is just below the value we have for our temperature.
  • The fourth line grabs the temperature in the steam table that is just above the value we have for our temperature.
  • The fifth line grabs the pressure in the steam table for the temperature that is just below the value we have for our temperature.
  • The sixth line grabs the pressure in the steam table for the temperature that is just above the value we have for our temperature.
  • The last line checks to see if we have a match in our steam table, and if so just returns the matched value. Otherwise, it performs the linear interpolation calculation.

Note the use of the ALL function within the FILTER function. This is the reason that we can handle relationships between our interpolation table and our lookup table.

 

Step 4: Create the Report

We can now easily create a simple report to show our interpolation results. First, create a Bubble Chart of Temp (C) and Pressure (MPa) from our original table, H2OTempSat. Then, create a second Bubble Chart of Temp (C) and Pressure (MPa) from our Interpolation table.

 

We should end up with something like the report below:

thermo2.png

 

As one can clearly see, the interpolated values for Pressure (MPa) neatly fill in the gaps from our original table.

 

Conclusion

Dr. Marquart was a sadist.

Comments

Added tracking counter

 

 

 

That was a good story to read. I understand that engineers were expected to read the different tables to perform calculations but what's the point of reading different tables and making interpolations and plugging the numbers into formulas when engineers are able to create an app for the entire calculation? 

@GeorgeGokmen - Not entirely sure I understand the question. But, my understanding of things like steam tables is that they are created from from complex equations of state, which in turn were developed from accurate experimental data. So the steam tables are the quick reference to lookup specific properties of liquids and such without having to run through all of the complex equations required to caculate things manually. Granted, this method was created back when there were no computers and such so it's probably possible to calculate these properties via some application today but I wouldn't be surprised if steam tables and such are still widely used in engineering even today. I might be wrong. But they certainly were back in the early 1990's. And, thermodynamics and fluid dynamics are just one example of where interpolation tends to be used. There are many other uses for interpolation. For example, one use that is extremely relevant today is filling in gaps in streaming datasets.

Excellent explanation Greg, I was able to successfully replicate, thanks a lot!

 

I have a slightly more complicated problem, interpolating through different categories / subsets, of an unpivoted dataset. Immagine this starting table:

Petepete_0-1598444865791.png

 

Picking the unique dates, I wanna linearly interpolate (yellow cells) to get

Petepete_1-1598445035352.png

 

Need to do this within powerBI, as the original table is the union to tables sourced from different DBs (Microsoft, Oracle).

Really going crazy about doing it, is it even possible?

 

 

@Petepete - Probably, I'll have to take a closer look.

OK @Petepete - I finally got back around to this. What I would do is start with this:

 

Interpolation Table = 
  GENERATE(
    CALENDAR(MIN('Table'[Date]),MAX('Table'[Date])),
    DISTINCT('Table'[PTF])
  )

 

You could then create a column that was something like:

Column = 
  VAR __Nav = MAXX(FILTER('Table','Table'[Date]=[Date] && 'Table'[PTF]=[PTF]),'Table'[NAV])
RETURN
  IF(NOT(ISBLANK(__Nav)),__Nav, <interpolation goes here>)

To get your values for interpolation, you would do something along the lines of:

VAR __x1 = MAXX(FILTER('Table','Table'[PTF]=[PTF] && 'Table'[Date]<[Date]),[Date])
VAR __x2 = MINX(FILTER('Table','Table'[PTF]=[PTF] && 'Table'[Date]>[Date]),[Date])
VAR __y1 = MAXX(FILTER('Table','Table'[PTF]=[PTF] && 'Table'[Date]=__x1),[NAV])
VAR __y2 = MAXX(FILTER('Table','Table'[PTF]=[PTF] && 'Table'[Date]=__x2),[NAV])

 

@Greg_Deckler truly a master user! Was able to replicate smoothly

Amazing man, thanks v much for that!

@Greg_Deckler I have written a pretty generic interpolation function in m code.

 

It takes a table, an x column name, and a y column name. It returns the same table, but with "missing" y values filled in.

 

https://stackoverflow.com/questions/64067210/how-can-i-interpolate-missing-values-in-a-column-in-pow...

 

Let me know if you have some ways to further improve on the concept.