strachi
Regular Visitor

Moving average over non-numeric values (correct errors, fill missing values)

Hi,

How can I smooth string values in a column?

 

I have time series data (timestamp; string) with some errors or gaps in it:

 

timestamp;string

1521585642;a

1521585643;a

1521585644;

1521585645;a

1521585646;a

1521585647;x

1521585648;a

1521585649;a

1521585650;a

 

I would like to fill the gap and replace the error ("x") using the values in proximity (let's say we want the most frequent value among the previous 2 and next 2 values). You could call this a moving average over strings. In this simple example, the result would be "a" in every row of the string column.

 

I feel like this comes close, but MAXA does not work with strings of course:

Smooth = 
CALCULATE (
    CALCULATE (
        MAXA( 'timeseries'[string] );
        'timeseries'[Datetime]
            >= VALUES ( 'timeseries'[Datetime] ) - 4 ;
        'timeseries'[Datetime] <= VALUES ( 'timeseries'[Datetime] )
    );
    ALLEXCEPT ( 'timeseries'; 'timeseries'[Tag];'timeseries'[Logfile];'timeseries'[Datetime] )
)

Any ideas would be greatly appreciated. I was not able to find a solution.

 

 

5 REPLIES
v-jiascu-msft
Employee

Hi @strachi,

 

Can you share a complete sample please? I can't convert the "timestamp" into a time or a date.

 

 

Best Regards,

Dale

Community Support Team _ Dale
If this post helps, please consider accepting it as the solution to help other members find it more quickly.

Hi @v-jiascu-msft, thanks for your reply.

 

In fact, we can simplify further. The "timestamp" does not matter here. The first column just indicates the order of the time series data. You can think of it as an ordered index.

 

Source:

timestamp;string

1;a

2;a

3;(blank)

4;a

5;a

6;x

7;a

8;a

9;a

 

Result I am looking for:

timestamp;string

1;a

2;a

3;a

4;a

5;a

6;a

7;a

8;a

9;a

 

The "blank" and the "x" are errors to be identified by looking at the previous and following values in the series. They should be replaced by the most frequent value "in the neighbourhood". 

 

Thank you for giving it another thought.

Sorry to push here... any ideas? @v-jiascu-msft

 

I am trying to use this to narrow down the strings in proximity to the data gap... 

FILTER(Table1;Table1[Index]<=EARLIER(Table1[Index])+1 && Table1[Index]>=EARLIER(Table1[Index])-1)

 

I guess this could help me, but I have not succeeded in putting it together in a calculated column:

 

https://community.powerbi.com/t5/Desktop/How-to-obtain-the-most-common-value-from-a-column-and-displ...

 

Most Frequent String = 
FIRSTNONBLANK (
    TOPN (
        1; 
        VALUES ( Table1[string] ); 
        RANKX( ALL( Table1[string] ); COUNTROWS(Table1);;ASC)
    ); 
    1 
)
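For reference, here is a rough sketch of how the window filter and the "most frequent string" idea might be combined in a single calculated column. It has not been run against the real model and rests on several assumptions: the table is Table1 with a unique, gap-free [Index] column as in the FILTER expression above; the window is the 2 previous and 2 next rows, excluding the current row; blank and empty strings are ignored; ties are broken alphabetically. The column name Smoothed and the variable names are made up for illustration. The list separator is ";" to match the other formulas in this thread.

```dax
Smoothed =
VAR CurrentIndex = Table1[Index]
-- Rows within 2 positions of the current row, excluding the row itself
-- and excluding blank or empty strings (BLANK () = "" is TRUE in DAX)
VAR Neighbours =
    FILTER (
        Table1;
        Table1[Index] >= CurrentIndex - 2
            && Table1[Index] <= CurrentIndex + 2
            && Table1[Index] <> CurrentIndex
            && Table1[string] <> ""
    )
-- One row per distinct neighbouring string, with its number of occurrences
VAR Counts =
    GROUPBY ( Neighbours; Table1[string]; "Cnt"; SUMX ( CURRENTGROUP (); 1 ) )
-- Highest count first; alphabetical order as tie-breaker (an assumption)
VAR Winner =
    TOPN ( 1; Counts; [Cnt]; DESC; Table1[string]; ASC )
RETURN
    MAXX ( Winner; Table1[string] )
```

Traced by hand on the 9-row example, every row comes out as "a", because the single "x" is always outvoted by its neighbours and the blank is never counted. If a row has no non-blank neighbours at all, this sketch returns blank. Whether the current row's own value should also get a vote is a design choice; here it does not, following the "last 2 and next 2 values" wording in the question.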

Anyone?
