Solved: Re: Dynamically calculating consecutive data 'n' p...

niharika0913 · ‎07-06-2017

Hi,

I am trying to plot Statistical Process Charts in Power BI. There is no object created for it in Power BI, so I am trying to bring in the calculations using the DAX engine. Looks alright until now. But, there are some rules that guide SPC charts. Example, 2 consecutive points above the calculated mean are out-of-control.

This is how the line chart look. Green, yellow and red lines are the calculated measures using DAX. My question is, how to add a measure that can find ‘n’ (can be 2,3,4) consecutive points that are above calculated mean? I want to mark them on the graph but if not possible then at least show those data points in a table.

The line graph changes dynamically on filter selection and hence the DAX calculated measures. Please suggest if there is way to calculate and display out-of-control (conditional) consecutive points in a table for the filter selection. Thanks.

OwenAuger · ‎07-17-2017

@kangsoonleong

If I understand you correctly, you are trying to calculate the number of "above average runs" within the selected time range, an interesting extension of the original question.

One way to do this would be to:

Create a measure Current Above Average Run Start Index that identifies, for a given date, the "start index" of the current run. This is used to uniquely identify each run.
Create a measure Count of Above Average Runs which counts the distinct values of the previous measure.

I'm not sure about performance for a large dataset, but here are some possible definitions. Note that the first measure is just a slight modification of the Current Above Average Run Length measure.

Sample model here

Current Above Average Run Start Index =
IF (
    HASONEVALUE ( 'Date'[Date] ),
    VAR MeasureAverage = [Your Measure Average]
    VAR MeasureCurrent = [Your Measure]
    RETURN
        IF (
            MeasureCurrent > MeasureAverage,
            VAR CurrentDate =
                VALUES ( 'Date'[Date] )
            VAR VisibleDates =
                CALCULATETABLE ( SUMMARIZE ( Data, 'Date'[Date] ), ALLSELECTED ( 'Date' ) )
            VAR VisibleDatesWithIndex =
                ADDCOLUMNS ( VisibleDates, "Index", RANKX ( VisibleDates, 'Date'[Date],, ASC ) )
            VAR DatesLeft =
                FILTER ( VisibleDatesWithIndex, 'Date'[Date] <= CurrentDate )
            VAR DatesRight =
                FILTER ( VisibleDatesWithIndex, 'Date'[Date] >= CurrentDate )
            VAR DatesLeftNotAboveAverage =
                FILTER ( DatesLeft, [Your Measure] <= MeasureAverage )
            VAR DatesRightNotAboveAverage =
                FILTER ( DatesRight, [Your Measure] <= MeasureAverage )
            VAR RunStart =
                IF (
                    ISEMPTY ( DatesLeftNotAboveAverage ),
                    MINX ( DatesLeft, [Index] ),
                    MAXX ( DatesLeftNotAboveAverage, [Index] ) + 1
                )
            VAR RunEnd =
                IF (
                    ISEMPTY ( DatesRightNotAboveAverage ),
                    MAXX ( DatesRight, [Index] ),
                    MINX ( DatesRightNotAboveAverage, [Index] ) - 1
                )
            RETURN
                IF ( RunEnd - RunStart + 1 >= [Run Length Threshold], RunStart )
        )
)

Count of Above Average Runs =
VAR VisibleDatesWithRunStartIndex =
    ADDCOLUMNS (
        ALLSELECTED ( 'Date'[Date] ),
        "RunStart", [Current Above Average Run Start Index]
    )
RETURN
    COUNTROWS (
        FILTER (
            SUMMARIZE ( VisibleDatesWithRunStartIndex, [RunStart] ),
            NOT ( ISBLANK ( [RunStart] ) )
        )
    )

At least these demonstrate that it can be done, but performance may need to be looked at for a large model.

Cheers

Owen

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

View solution in original post

OwenAuger · ‎07-08-2017

@niharika0913

Here is a link to a sample model showing an example of what I believe you want to do.

My sample visual looks like this:

The key requirement to get this to work is to calculate the length of the current above-average run.

My model has a Data table with columns Date/Value/Type, related to a Date table.

The series of measures I created to get to the result are:

Your Measure is the base measure that is being plotted.

Your Measure = 
SUM ( Data[Value] )

Your Measure Average is the average that you want to compare Your Measure to on any given date. This should be constant over time. Can be defined however you like.

Your Measure Average for Display returns the average value only if Your Measure is nonblank (for use on the visual).

Your Measure Average = 
// Average of Your Measure over all dates
AVERAGEX ( ALL ( 'Date'[Date] ), [Your Measure] )

Your Measure Average for Display = 
// Only display average where Your Measure is nonblank
IF (
 NOT ( ISBLANK ( [Your Measure] ) ),
 [Your Measure Average]
)

Current Above Average Run Length is the most important measure, which calculates the current run length of above-average values visible in the visual. It basically starts from the current date and looks left and right to find the start/end points of the current run. This would have been difficult and probably unreadable without variables. Note: It returns blank if the current date is not within a run.

Current Above Average Run Length = 
IF (
    HASONEVALUE ( 'Date'[Date] ),
    VAR MeasureAverage = [Your Measure Average]
    VAR MeasureCurrent = [Your Measure]
    RETURN
        IF (
            MeasureCurrent > MeasureAverage,
            VAR CurrentDate =
                VALUES ( 'Date'[Date] )
            VAR VisibleDates =
                CALCULATETABLE ( SUMMARIZE ( Data, 'Date'[Date] ), ALLSELECTED ( 'Date' ) )
            VAR VisibleDatesWithIndex =
                ADDCOLUMNS ( VisibleDates, "Index", RANKX ( VisibleDates, 'Date'[Date],, ASC ) )
            VAR DatesLeft =
                FILTER ( VisibleDatesWithIndex, 'Date'[Date] <= CurrentDate )
            VAR DatesRight =
                FILTER ( VisibleDatesWithIndex, 'Date'[Date] >= CurrentDate )
            VAR DatesLeftNotAboveAverage =
                FILTER ( DatesLeft, [Your Measure] <= MeasureAverage )
            VAR DatesRightNotAboveAverage =
                FILTER ( DatesRight, [Your Measure] <= MeasureAverage )
            VAR RunStart =
                IF (
                    ISEMPTY ( DatesLeftNotAboveAverage ),
                    MINX ( DatesLeft, [Index] ),
                    MAXX ( DatesLeftNotAboveAverage, [Index] ) + 1
                )
            VAR RunEnd =
                IF (
                    ISEMPTY ( DatesRightNotAboveAverage ),
                    MAXX ( DatesRight, [Index] ),
                    MINX ( DatesRightNotAboveAverage, [Index] ) - 1
                )
            RETURN
                RunEnd - RunStart
                    + 1
        )
)

Run Length Threshold harvests the threshold run length from a slicer selection, otherwise defaults to 2.

Run Length Threshold = 
SELECTEDVALUE( 'Run Length'[Run Length], 2 )
// Default Value = 2

Your Measure If Within Above Average Run just returns Your Measure for points within a qualifying run.

Your Measure If Within Above Average Run = 
IF (
    [Current Above Average Run Length] >= [Run Length Threshold],
    [Your Measure]
)

I'm not sure how 'optimal' this is performance-wise, but it's a starting point.

Regards,

Owen

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

kangsoonleong · ‎07-17-2017

Hi,

Tried your solution and it works flawlessly.

However, i am trying to count the number of occurences over the threshold using existing measures, but we cannot do a distnct count of measure(Current Above Average Run Length).

For example, in the same visual, the occurences should be 2, the ones highlighted in RED. Any Clues how to do it in DAx?

OwenAuger · ‎07-17-2017

@kangsoonleong

If I understand you correctly, you are trying to calculate the number of "above average runs" within the selected time range, an interesting extension of the original question.

One way to do this would be to:

Create a measure Current Above Average Run Start Index that identifies, for a given date, the "start index" of the current run. This is used to uniquely identify each run.
Create a measure Count of Above Average Runs which counts the distinct values of the previous measure.

I'm not sure about performance for a large dataset, but here are some possible definitions. Note that the first measure is just a slight modification of the Current Above Average Run Length measure.

Sample model here

Current Above Average Run Start Index =
IF (
    HASONEVALUE ( 'Date'[Date] ),
    VAR MeasureAverage = [Your Measure Average]
    VAR MeasureCurrent = [Your Measure]
    RETURN
        IF (
            MeasureCurrent > MeasureAverage,
            VAR CurrentDate =
                VALUES ( 'Date'[Date] )
            VAR VisibleDates =
                CALCULATETABLE ( SUMMARIZE ( Data, 'Date'[Date] ), ALLSELECTED ( 'Date' ) )
            VAR VisibleDatesWithIndex =
                ADDCOLUMNS ( VisibleDates, "Index", RANKX ( VisibleDates, 'Date'[Date],, ASC ) )
            VAR DatesLeft =
                FILTER ( VisibleDatesWithIndex, 'Date'[Date] <= CurrentDate )
            VAR DatesRight =
                FILTER ( VisibleDatesWithIndex, 'Date'[Date] >= CurrentDate )
            VAR DatesLeftNotAboveAverage =
                FILTER ( DatesLeft, [Your Measure] <= MeasureAverage )
            VAR DatesRightNotAboveAverage =
                FILTER ( DatesRight, [Your Measure] <= MeasureAverage )
            VAR RunStart =
                IF (
                    ISEMPTY ( DatesLeftNotAboveAverage ),
                    MINX ( DatesLeft, [Index] ),
                    MAXX ( DatesLeftNotAboveAverage, [Index] ) + 1
                )
            VAR RunEnd =
                IF (
                    ISEMPTY ( DatesRightNotAboveAverage ),
                    MAXX ( DatesRight, [Index] ),
                    MINX ( DatesRightNotAboveAverage, [Index] ) - 1
                )
            RETURN
                IF ( RunEnd - RunStart + 1 >= [Run Length Threshold], RunStart )
        )
)

Count of Above Average Runs =
VAR VisibleDatesWithRunStartIndex =
    ADDCOLUMNS (
        ALLSELECTED ( 'Date'[Date] ),
        "RunStart", [Current Above Average Run Start Index]
    )
RETURN
    COUNTROWS (
        FILTER (
            SUMMARIZE ( VisibleDatesWithRunStartIndex, [RunStart] ),
            NOT ( ISBLANK ( [RunStart] ) )
        )
    )

At least these demonstrate that it can be done, but performance may need to be looked at for a large model.

Cheers

Owen

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

niharika0913 · ‎07-09-2017

Thank you so much for your reply. I will try and implement your solution. Will let you know if it works for me.

>> One doubt Owen, I could not find function SELECTEDVALUE( ). It gives an error "It is not a valid function name".

OwenAuger · ‎07-09-2017

@niharika0913

Oh sure...I think SELECTEDVALUE was added in the Power BI Desktop July 2017 Update (2.48.4792.321).

In earlier versions you can write that measure this way without the SELECTEDVALUE function:

Run Length Threshold = 
IF (
    HASONEVALUE ( 'Run Length'[Run Length] ),
    VALUES ( 'Run Length'[Run Length] ),
    2
)
// Default Value = 2

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

v-shex-msft · ‎07-07-2017

Hi @niharika0913,

Since you use legend field, you can't add multiple value fields to line chart. So, I'd like to suggest you write a measure to calculate average of each legend.

Sample measure:

AVG =
var current_Type=LASTNONBLANK(Table[Type],[Type])
return
AVERAGEX(FILTER(ALLSELECTED(Table),[Type]=current_Type),[Amount])

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

niharika0913 · ‎07-07-2017

Hi, Thanks for your reply. Can you explain your answer. My question is on how to calculate consecutive points that are above say average of the filtered data? Average is a calculated measure. So, I am trying to find those occurrences where 3 consecutive points are above average and state those occurrences in a table/on graph.

Dynamically calculating consecutive data 'n' points

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly