cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
justlogmein
Helper III
Helper III

How to interpolate data that is in groups to estimate missing values?

I am currently doing an engine oil analysis for mining equipment and I have noticed that many of the engine hour readings on the oil sample cards are incorrect. Some are obvious outliers, but others are more subtle. All of the readings should stay the same or increase as the date increases, but some dip on the next date or go up on a previous date.

 

Because I can't work out which of these readings is the incorrect one in all scenarios, I thought it would be best to simply remove any I am not sure about (including the outliers) and then interpolate the readings using the dates as the independent variable. I also need to group these by machine, called Unit Number in the data.

 

I am having some trouble interpolating and am finding very little information on DAX interpolation. I have tried to do it myself using the Y = Y1 + (Y2 - Y1)/(X2 - X1) * (X - X1) formula, but it doesn't seem to be working for me. Would someone mind looking at my code to see how I can complete the very last column? (I have done this in an Excel file as Power BI has a limit on the rows you can enter into Power Query manually, which would mean two separate files).


DAX Interpolation Problem 

 

Untitled.png

Untitled2.png

1 ACCEPTED SOLUTION

Hi @justlogmein 
Actually I have double checked. There is no problem with the interpolation. It is 100% accurate but the thing is you the intervals in the x-axis are not uniform which makes the chart looks odd. I think you need to proceed with a different aproach uniforming the intervals at the x-axis. 

View solution in original post

11 REPLIES 11
tamerj1
Super User
Super User

Hi @justlogmein 
Here is the sample file with the solution https://www.dropbox.com/t/iRbNBQW2Jfh0ylaw

1.png2.png

=
VAR CurrentValue = [Engine Meter Reading - Valid Only]
VAR CurrentDate = 'Sample Data'[Sample Date]
VAR CurrentUnitTable = FILTER ( CALCULATETABLE ( 'Sample Data', ALLEXCEPT ( 'Sample Data', 'Sample Data'[Unit Number] ) ), [Engine Meter Reading - Valid Only] <> BLANK ( ) )
VAR DatesBefore = FILTER ( CurrentUnitTable, 'Sample Data'[Sample Date] < CurrentDate )
VAR DatesAfter = FILTER ( CurrentUnitTable, 'Sample Data'[Sample Date] > CurrentDate )
VAR PreviousDate = MAXX ( DatesBefore, 'Sample Data'[Sample Date] )
VAR NextDate = MINX ( DatesAfter,  'Sample Data'[Sample Date] )
VAR DateDifference1 = DATEDIFF ( PreviousDate, NextDate, DAY ) 
VAR DateDifference2 = DATEDIFF ( PreviousDate, CurrentDate, DAY ) 
VAR PreviousValue = MAXX ( FILTER ( DatesBefore, 'Sample Data'[Sample Date] = PreviousDate ), 'Sample Data'[Engine Meter Reading - Valid Only] )
VAR NextValue = MAXX ( FILTER ( DatesAfter, 'Sample Data'[Sample Date] = NextDate ), 'Sample Data'[Engine Meter Reading - Valid Only] )
VAR ValueDifference = NextValue - PreviousValue
RETURN
	IF ( ISBLANK ( CurrentValue ), PreviousValue + DIVIDE ( ValueDifference, DateDifference1 ) * DateDifference2, CurrentValue )

Thank you Tamer. I see that there are some large jumps in the hours in the output there. I would have expected the interpolated data (for missing values) to be half the difference of the two two known numbers on each side of it, so essentially a straight line between between the known ones. I'm wondering if the multiple missing values in a row is affecting the calculation?

Hi @justlogmein 
Actually I have double checked. There is no problem with the interpolation. It is 100% accurate but the thing is you the intervals in the x-axis are not uniform which makes the chart looks odd. I think you need to proceed with a different aproach uniforming the intervals at the x-axis. 

Thank you. Would you happen to know how I could do this in DAX? I could easily interpolate the data, even if not uniform, in Excel, but I am not that experienced with DAX to know what formulas to use.

@justlogmein 
What is the time interval that you're looking for?

Only for those dates that are on there. The dates will be sporadic, but all I am trying to do is fill in the gaps for the missing ones i.e. draw a direct across the top of the known ones on the graph and fill up to that line for the missing dates.

Hi @justlogmein 
You can create a date table and use it for slicing. A measure will be more practical than a calculated column https://www.dropbox.com/t/fEcL3Hmb9Ob0ea3Q

Engine Hours:=VAR CurrentValue = MAX ( 'Sample Data'[Engine Meter Reading - Valid Only] )
VAR MaxDateWithData = CALCULATE ( MAX ( 'Sample Data'[Sample Date] ), ALL ( 'Calendar' ) )
VAR CurrentDate = MAX ( 'Calendar'[Date] )
VAR CurrentUnitTable = FILTER ( CALCULATETABLE ( 'Sample Data', ALLEXCEPT ( 'Sample Data', 'Sample Data'[Unit Number] ) ), [Engine Meter Reading - Valid Only] <> BLANK ( ) )
VAR DatesBefore = FILTER ( CurrentUnitTable, 'Sample Data'[Sample Date] < CurrentDate )
VAR DatesAfter = FILTER ( CurrentUnitTable, 'Sample Data'[Sample Date] > CurrentDate )
VAR PreviousDate = MAXX ( DatesBefore, 'Sample Data'[Sample Date] )
VAR NextDate = MINX ( DatesAfter,  'Sample Data'[Sample Date] )
VAR DateDifference1 = DATEDIFF ( PreviousDate, NextDate, DAY ) 
VAR DateDifference2 = DATEDIFF ( PreviousDate, CurrentDate, DAY ) 
VAR PreviousValue = MAXX ( FILTER ( DatesBefore, 'Sample Data'[Sample Date] = PreviousDate ), 'Sample Data'[Engine Meter Reading - Valid Only] )
VAR NextValue = MAXX ( FILTER ( DatesAfter, 'Sample Data'[Sample Date] = NextDate ), 'Sample Data'[Engine Meter Reading - Valid Only] )
VAR ValueDifference = NextValue - PreviousValue
VAR Result =
	IF ( 
		MaxDateWithData >= CurrentDate,
		IF ( ISBLANK ( CurrentValue ), PreviousValue + DIVIDE ( ValueDifference, DateDifference1 ) * DateDifference2, CurrentValue )
	)
RETURN
	Result

2.png1.png

 

AlB
Super User
Super User

Can you share the pbix where youa re implementing this? I do not see any formula in the Excel you shared and in any case we'd still ave to look at the DAX code to see what is wrong

 

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

In the Excel file, go to Data > Load Data Model

justlogmein
Helper III
Helper III

Yes, you'll see I have already gotten the previous and next values needed for the formula, but it is not giving the correct results. 

AlB
Super User
Super User

Hi @justlogmein 

You can use the solutions to the question you posted earlier to get the previous and next Y values and X values and calculate the interpolated value with the formula you show

 

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

Helpful resources

Announcements
September Update

Check it Out!

Click here to learn more about the September 2022 updates!

Power BI Dev Camp Session 26

Check it Out!

Mark your calendars and join us on Thursday, September 29 at 11a PDT for a great session with Ted Pattison!

Top Solution Authors