Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
niharika0913
Frequent Visitor

Dynamically calculating consecutive data 'n' points

Hi,

 

I am trying to plot Statistical Process Charts in Power BI. There is no object created for it in Power BI, so I am trying to bring in the calculations using the DAX engine. Looks alright until now. But, there are some rules that guide SPC charts. Example, 2 consecutive points above the calculated mean are out-of-control.

 

This is how the line chart look. Green, yellow and red lines are the calculated measures using DAX.  My question is, how to  add a measure that can find ‘n’ (can be 2,3,4) consecutive points that are above calculated mean? I want to mark them on the graph but if not possible then at least show those data points in a table.

 image009.png

 The line graph changes dynamically on filter selection and hence the DAX calculated measures. Please suggest if there is way to calculate and display out-of-control (conditional) consecutive points in a table for the filter selection. Thanks.

 

 

1 ACCEPTED SOLUTION

@kangsoonleong

 

If I understand you correctly, you are trying to calculate the number of "above average runs" within the selected time range, an interesting extension of the original question.

 

One way to do this would be to:

  1. Create a measure Current Above Average Run Start Index that identifies, for a given date, the "start index" of the current run. This is used to uniquely identify each run.
  2. Create a measure Count of Above Average Runs which counts the distinct values of the previous measure.

I'm not sure about performance for a large dataset, but here are some possible definitions. Note that the first measure is just a slight modification of the Current Above Average Run Length measure.

 

Sample model here

 

 

Current Above Average Run Start Index =
IF (
    HASONEVALUE ( 'Date'[Date] ),
    VAR MeasureAverage = [Your Measure Average]
    VAR MeasureCurrent = [Your Measure]
    RETURN
        IF (
            MeasureCurrent > MeasureAverage,
            VAR CurrentDate =
                VALUES ( 'Date'[Date] )
            VAR VisibleDates =
                CALCULATETABLE ( SUMMARIZE ( Data, 'Date'[Date] ), ALLSELECTED ( 'Date' ) )
            VAR VisibleDatesWithIndex =
                ADDCOLUMNS ( VisibleDates, "Index", RANKX ( VisibleDates, 'Date'[Date],, ASC ) )
            VAR DatesLeft =
                FILTER ( VisibleDatesWithIndex, 'Date'[Date] <= CurrentDate )
            VAR DatesRight =
                FILTER ( VisibleDatesWithIndex, 'Date'[Date] >= CurrentDate )
            VAR DatesLeftNotAboveAverage =
                FILTER ( DatesLeft, [Your Measure] <= MeasureAverage )
            VAR DatesRightNotAboveAverage =
                FILTER ( DatesRight, [Your Measure] <= MeasureAverage )
            VAR RunStart =
                IF (
                    ISEMPTY ( DatesLeftNotAboveAverage ),
                    MINX ( DatesLeft, [Index] ),
                    MAXX ( DatesLeftNotAboveAverage, [Index] ) + 1
                )
            VAR RunEnd =
                IF (
                    ISEMPTY ( DatesRightNotAboveAverage ),
                    MAXX ( DatesRight, [Index] ),
                    MINX ( DatesRightNotAboveAverage, [Index] ) - 1
                )
            RETURN
                IF ( RunEnd - RunStart + 1 >= [Run Length Threshold], RunStart )
        )
)
Count of Above Average Runs =
VAR VisibleDatesWithRunStartIndex =
    ADDCOLUMNS (
        ALLSELECTED ( 'Date'[Date] ),
        "RunStart", [Current Above Average Run Start Index]
    )
RETURN
    COUNTROWS (
        FILTER (
            SUMMARIZE ( VisibleDatesWithRunStartIndex, [RunStart] ),
            NOT ( ISBLANK ( [RunStart] ) )
        )
    )

At least these demonstrate that it can be done, but performance may need to be looked at for a large model.

 

Cheers

Owen

 


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

View solution in original post

7 REPLIES 7
OwenAuger
Super User
Super User

@niharika0913

 

Here is a link to a sample model showing an example of what I believe you want to do.

 

My sample visual looks like this:Capture.PNG

 

 

 

The key requirement to get this to work is to calculate the length of the current above-average run.

 

My model has a Data table with columns Date/Value/Type, related to a Date table.

 

The series of measures I created to get to the result are:

 

Your Measure is the base measure that is being plotted.

 

Your Measure = 
SUM ( Data[Value] )

Your Measure Average is the average that you want to compare Your Measure to on any given date. This should be constant over time. Can be defined however you like.

Your Measure Average for Display returns the average value only if Your Measure is nonblank (for use on the visual).

 

Your Measure Average = 
// Average of Your Measure over all dates
AVERAGEX ( ALL ( 'Date'[Date] ), [Your Measure] )

Your Measure Average for Display =
// Only display average where Your Measure is nonblank
IF (
NOT ( ISBLANK ( [Your Measure] ) ),
[Your Measure Average]
)

Current Above Average Run Length is the most important measure, which calculates the current run length of above-average values visible in the visual. It basically starts from the current date and looks left and right to find the start/end points of the current run. This would have been difficult and probably unreadable without variables. Note: It returns blank if the current date is not within a run.

 

 

Current Above Average Run Length = 
IF (
    HASONEVALUE ( 'Date'[Date] ),
    VAR MeasureAverage = [Your Measure Average]
    VAR MeasureCurrent = [Your Measure]
    RETURN
        IF (
            MeasureCurrent > MeasureAverage,
            VAR CurrentDate =
                VALUES ( 'Date'[Date] )
            VAR VisibleDates =
                CALCULATETABLE ( SUMMARIZE ( Data, 'Date'[Date] ), ALLSELECTED ( 'Date' ) )
            VAR VisibleDatesWithIndex =
                ADDCOLUMNS ( VisibleDates, "Index", RANKX ( VisibleDates, 'Date'[Date],, ASC ) )
            VAR DatesLeft =
                FILTER ( VisibleDatesWithIndex, 'Date'[Date] <= CurrentDate )
            VAR DatesRight =
                FILTER ( VisibleDatesWithIndex, 'Date'[Date] >= CurrentDate )
            VAR DatesLeftNotAboveAverage =
                FILTER ( DatesLeft, [Your Measure] <= MeasureAverage )
            VAR DatesRightNotAboveAverage =
                FILTER ( DatesRight, [Your Measure] <= MeasureAverage )
            VAR RunStart =
                IF (
                    ISEMPTY ( DatesLeftNotAboveAverage ),
                    MINX ( DatesLeft, [Index] ),
                    MAXX ( DatesLeftNotAboveAverage, [Index] ) + 1
                )
            VAR RunEnd =
                IF (
                    ISEMPTY ( DatesRightNotAboveAverage ),
                    MAXX ( DatesRight, [Index] ),
                    MINX ( DatesRightNotAboveAverage, [Index] ) - 1
                )
            RETURN
                RunEnd - RunStart
                    + 1
        )
)

Run Length Threshold harvests the threshold run length from a slicer selection, otherwise defaults to 2.

 

 

Run Length Threshold = 
SELECTEDVALUE( 'Run Length'[Run Length], 2 )
// Default Value = 2

Your Measure If Within Above Average Run just returns Your Measure for points within a qualifying run.

Your Measure If Within Above Average Run = 
IF (
    [Current Above Average Run Length] >= [Run Length Threshold],
    [Your Measure]
)

I'm not sure how 'optimal' this is performance-wise, but it's a starting point.

 

Regards,

Owen

 


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

Hi,

 

Tried your solution and it works flawlessly.

 

However, i am trying to count the number of occurences over the threshold using existing measures, but we cannot do a distnct count of measure(Current Above Average Run Length).

 

For example, in the same visual, the occurences should be 2, the ones highlighted in RED.  Any Clues how to do it in DAx?

@kangsoonleong

 

If I understand you correctly, you are trying to calculate the number of "above average runs" within the selected time range, an interesting extension of the original question.

 

One way to do this would be to:

  1. Create a measure Current Above Average Run Start Index that identifies, for a given date, the "start index" of the current run. This is used to uniquely identify each run.
  2. Create a measure Count of Above Average Runs which counts the distinct values of the previous measure.

I'm not sure about performance for a large dataset, but here are some possible definitions. Note that the first measure is just a slight modification of the Current Above Average Run Length measure.

 

Sample model here

 

 

Current Above Average Run Start Index =
IF (
    HASONEVALUE ( 'Date'[Date] ),
    VAR MeasureAverage = [Your Measure Average]
    VAR MeasureCurrent = [Your Measure]
    RETURN
        IF (
            MeasureCurrent > MeasureAverage,
            VAR CurrentDate =
                VALUES ( 'Date'[Date] )
            VAR VisibleDates =
                CALCULATETABLE ( SUMMARIZE ( Data, 'Date'[Date] ), ALLSELECTED ( 'Date' ) )
            VAR VisibleDatesWithIndex =
                ADDCOLUMNS ( VisibleDates, "Index", RANKX ( VisibleDates, 'Date'[Date],, ASC ) )
            VAR DatesLeft =
                FILTER ( VisibleDatesWithIndex, 'Date'[Date] <= CurrentDate )
            VAR DatesRight =
                FILTER ( VisibleDatesWithIndex, 'Date'[Date] >= CurrentDate )
            VAR DatesLeftNotAboveAverage =
                FILTER ( DatesLeft, [Your Measure] <= MeasureAverage )
            VAR DatesRightNotAboveAverage =
                FILTER ( DatesRight, [Your Measure] <= MeasureAverage )
            VAR RunStart =
                IF (
                    ISEMPTY ( DatesLeftNotAboveAverage ),
                    MINX ( DatesLeft, [Index] ),
                    MAXX ( DatesLeftNotAboveAverage, [Index] ) + 1
                )
            VAR RunEnd =
                IF (
                    ISEMPTY ( DatesRightNotAboveAverage ),
                    MAXX ( DatesRight, [Index] ),
                    MINX ( DatesRightNotAboveAverage, [Index] ) - 1
                )
            RETURN
                IF ( RunEnd - RunStart + 1 >= [Run Length Threshold], RunStart )
        )
)
Count of Above Average Runs =
VAR VisibleDatesWithRunStartIndex =
    ADDCOLUMNS (
        ALLSELECTED ( 'Date'[Date] ),
        "RunStart", [Current Above Average Run Start Index]
    )
RETURN
    COUNTROWS (
        FILTER (
            SUMMARIZE ( VisibleDatesWithRunStartIndex, [RunStart] ),
            NOT ( ISBLANK ( [RunStart] ) )
        )
    )

At least these demonstrate that it can be done, but performance may need to be looked at for a large model.

 

Cheers

Owen

 


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

Thank you so much for your reply. I will try and implement your solution. Will let you know if it works for me.

 

 

 

>> One doubt Owen, I could not find function SELECTEDVALUE( ). It gives an error "It is not a valid function name".

@niharika0913

 

Oh sure...I think SELECTEDVALUE was added in the Power BI Desktop July 2017 Update (2.48.4792.321).

 

In earlier versions you can write that measure this way without the SELECTEDVALUE function:

Run Length Threshold = 
IF (
    HASONEVALUE ( 'Run Length'[Run Length] ),
    VALUES ( 'Run Length'[Run Length] ),
    2
)
// Default Value = 2

 


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn
v-shex-msft
Community Support
Community Support

Hi @niharika0913,

 

Since you use legend field, you can't add multiple value fields to line chart. So, I'd like to suggest you write a measure to calculate average of each legend.

 

Sample measure:

AVG =
var current_Type=LASTNONBLANK(Table[Type],[Type])
return
AVERAGEX(FILTER(ALLSELECTED(Table),[Type]=current_Type),[Amount]) 

 

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Hi, Thanks for your reply. Can you explain your answer. My question is on how to calculate consecutive points that are above say average of the filtered data? Average is a calculated measure. So, I am trying to find those occurrences where 3 consecutive points are above average and state those occurrences in a table/on graph.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.