cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Mark88 Frequent Visitor
Frequent Visitor

Filtering table rows on outliers

One of our machines is measuring the temperature development of certain articles. Problem is that we sometimes have 'outliers' in our measurements of the temperature. Now we want to filter out these outliers to analyze where they are caused.

 

The whole file has about 5M rows and contains many articles. The temperature measurements take place every 10 minutes. Question is: How can I create a formula in which I ask Power BI to:

 

- filter out the rows with an identical ArticleNr

- place the measurements in chronological order (so every ten minutes)

- filter out the measurements that differ 10 degrees opposed to the previouw measure (so +10 degrees or - 10 degrees), these are the outliers we want to filter out and analyze

 

ArticleNrMeasurementDateTemperature
128960078585555-2-2018 11:4021
128960078585555-2-2018 11:5021
128960078585555-2-2018 12:0020,8
128960078585555-2-2018 12:1020,8
128960078585555-2-2018 12:205,8
128960078585555-2-2018 12:3020,8
128960078585555-2-2018 12:4020,8
128960078585555-2-2018 12:5021
128960078585555-2-2018 13:0020,9
128960078585555-2-2018 13:1020,9
1289600786419926-2-2018 00:4020,1
1289600786419926-2-2018 00:5020,3
1289600786419926-2-2018 01:0020,1
1289600786419926-2-2018 01:1020,3
1289600786419926-2-2018 01:2054,8
1289600786419926-2-2018 01:3020,1
1289600786419926-2-2018 01:4019,9
1289600786419926-2-2018 01:5020,1
1289600786419926-2-2018 02:0019,9
1289600786419926-2-2018 02:1019,9

 

 

Above example shows two different article numbers. In both measurements is an outlier in Bold.

 

Can you please advise if it's possible to filter out the outliers in Power BI?

1 ACCEPTED SOLUTION

Accepted Solutions
v-danhe-msft Super Contributor
Super Contributor

Re: Filtering table rows on outliers

Hi ,

Based on my test, you can modify the column [pre10minvalue] I have created in the previous pbix.

pre10minvalue =

var previousweek = Sheet1[pre10min]

var saleonpre = CALCULATE(SUM(Sheet1[Temperature]),FILTER(Sheet1,Sheet1[MeasurementDate] = previousweek && Sheet1[ArticleNr]=EARLIER(Sheet1[ArticleNr])))

return

IF(ISBLANK(saleonpre),0,saleonpre)

A.PNG

Now you can see the correct result.

You can also download the PBIX file to have a view.

https://www.dropbox.com/s/y1l6phnzvo7zw0o/Filtering%20table%20rows%20on%20outliers.pbix?dl=0

 

Regards,

Daniel He

Community Support Team _ Daniel He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

4 REPLIES 4
v-danhe-msft Super Contributor
Super Contributor

Re: Filtering table rows on outliers

Hi @Mark88,

Based on my test, you can follow below steps:

1.I have entered some sample data to test for your problem.

A.PNG

2.Create three calculated columns to calculate your [ProblemValue].

pre10min = [MeasurementDate]-10/1440

pre10minvalue =

var previousweek = Sheet1[pre10min]

var saleonpre = CALCULATE(SUM(Sheet1[Temperature]),ALL(Sheet1),Sheet1[MeasurementDate] = previousweek)

return

IF(ISBLANK(saleonpre),0,saleonpre)

ProblemValue = IF(([Temperature]-[pre10minvalue])=[Temperature],BLANK(),IF(([Temperature]-[pre10minvalue])>10,[Temperature],IF(([Temperature]-[pre10minvalue])<-10,[Temperature])))

B.PNG

3.Create a Slicer visual and add the [ArticleNr] field to filter out the rows with an identical ArticleNr.

C.PNG

4.Create a Table visual and add the [ArticleNr], [MeasurementData], [Temperature] and the [ProblemValue].

D.PNG

5.You can use the sort function in the table visual to order the [MeasurementData].

E.PNG

You can also download the PBIX file to have a view.

https://www.dropbox.com/s/y1l6phnzvo7zw0o/Filtering%20table%20rows%20on%20outliers.pbix?dl=0

 

Regards,

Daniel He

Community Support Team _ Daniel He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Mark88 Frequent Visitor
Frequent Visitor

Re: Filtering table rows on outliers

Hi @v-danhe-msft,

 

Thanks for your reply! When testing this solution on our 5M row file I faced a problem. Our machine is measuring multiple articles on the same time, so we have duplicate MeasurementsDates in the file. The formula you've created in the 'pre10minvalue' column works perfect when measured in unique time intervals (unique MeasurementDates). However, our machine is sometimes measuring 5 ArticleNr in the same time interval and your formula sums up the temperature per time interval. See the example below:

 

ArticleNrMeasurementDateTemperaturepre10minpre10minvalue
128960078585555-2-2018 11:40215-2-2018 11:300
128960078585555-2-2018 11:50215-2-2018 11:4021
128960078585555-2-2018 12:0020,85-2-2018 11:5021
128960078585555-2-2018 12:1020,85-2-2018 12:0020,8
128960078585555-2-2018 12:205,85-2-2018 12:1020,8
128960078585555-2-2018 12:3020,85-2-2018 12:205,8
128960078585555-2-2018 12:4020,85-2-2018 12:3020,8
128960078585555-2-2018 12:50215-2-2018 12:4020,8
128960078585555-2-2018 13:0020,95-2-2018 12:5021
128960078585555-2-2018 13:1020,95-2-2018 13:0020,9
1289600786419926-2-2018 00:4020,126-2-2018 00:300
1289600786419926-2-2018 00:5020,326-2-2018 00:4020,1
1289600786419926-2-2018 01:0020,126-2-2018 00:5020,3
1289600786419926-2-2018 01:1020,326-2-2018 01:0020,1
1289600786419926-2-2018 01:2054,826-2-2018 01:1020,3
1289600786419926-2-2018 01:3020,126-2-2018 01:2054,8
1289600786419926-2-2018 01:4019,926-2-2018 01:3020,1
1289600786419926-2-2018 01:5020,126-2-2018 01:4043,7
1289600786419926-2-2018 02:0019,926-2-2018 01:5020,1
1289600786419926-2-2018 02:1019,926-2-2018 02:0019,9
1289600786418826-2-2018 01:3623,626-2-2018 01:260
1289600786418826-2-2018 01:4023,826-2-2018 01:3020,1

 

The rows where the problem starts are highlighted with an underscore. You see that it sums up the temperature for that unique time interval. What I am looking for is the temperature for a unique ArticleNR+MeasurementDate combination.

 

Can you please advise how I can fix this?

v-danhe-msft Super Contributor
Super Contributor

Re: Filtering table rows on outliers

Hi ,

Based on my test, you can modify the column [pre10minvalue] I have created in the previous pbix.

pre10minvalue =

var previousweek = Sheet1[pre10min]

var saleonpre = CALCULATE(SUM(Sheet1[Temperature]),FILTER(Sheet1,Sheet1[MeasurementDate] = previousweek && Sheet1[ArticleNr]=EARLIER(Sheet1[ArticleNr])))

return

IF(ISBLANK(saleonpre),0,saleonpre)

A.PNG

Now you can see the correct result.

You can also download the PBIX file to have a view.

https://www.dropbox.com/s/y1l6phnzvo7zw0o/Filtering%20table%20rows%20on%20outliers.pbix?dl=0

 

Regards,

Daniel He

Community Support Team _ Daniel He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

Mark88 Frequent Visitor
Frequent Visitor

Re: Filtering table rows on outliers

Many thanks for your help @v-danhe-msft!! This solved the issue.

Helpful resources

Announcements
October 2019 Community Highlights

October 2019 Community Highlights

October was a busy month in the community. Read the recap article to learn about some of the events and content.

New Solution Badges

New Solution Badges

Two waves of brand new solution badges are coming! Read the article for more information on our new community badges.

Ask Amir Anything

Exclusive LIVE Community Event No. 2 – Ask Amir Anything

Next in our Triple A series: Ask Amir Netz questions about the latest updates, features and future.

Analytics in Azure virtual event

Analytics in Azure virtual event

Experience a limitless analytics service built to ingest, prep, manage, and serve data for immediate use in Power BI.

Users Online
Currently online: 141 members 1,669 guests
Please welcome our newest community members: