Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Mark88
Frequent Visitor

Filtering table rows on outliers

One of our machines is measuring the temperature development of certain articles. Problem is that we sometimes have 'outliers' in our measurements of the temperature. Now we want to filter out these outliers to analyze where they are caused.

 

The whole file has about 5M rows and contains many articles. The temperature measurements take place every 10 minutes. Question is: How can I create a formula in which I ask Power BI to:

 

- filter out the rows with an identical ArticleNr

- place the measurements in chronological order (so every ten minutes)

- filter out the measurements that differ 10 degrees opposed to the previouw measure (so +10 degrees or - 10 degrees), these are the outliers we want to filter out and analyze

 

ArticleNrMeasurementDateTemperature
128960078585555-2-2018 11:4021
128960078585555-2-2018 11:5021
128960078585555-2-2018 12:0020,8
128960078585555-2-2018 12:1020,8
128960078585555-2-2018 12:205,8
128960078585555-2-2018 12:3020,8
128960078585555-2-2018 12:4020,8
128960078585555-2-2018 12:5021
128960078585555-2-2018 13:0020,9
128960078585555-2-2018 13:1020,9
1289600786419926-2-2018 00:4020,1
1289600786419926-2-2018 00:5020,3
1289600786419926-2-2018 01:0020,1
1289600786419926-2-2018 01:1020,3
1289600786419926-2-2018 01:2054,8
1289600786419926-2-2018 01:3020,1
1289600786419926-2-2018 01:4019,9
1289600786419926-2-2018 01:5020,1
1289600786419926-2-2018 02:0019,9
1289600786419926-2-2018 02:1019,9

 

 

Above example shows two different article numbers. In both measurements is an outlier in Bold.

 

Can you please advise if it's possible to filter out the outliers in Power BI?

1 ACCEPTED SOLUTION

Hi ,

Based on my test, you can modify the column [pre10minvalue] I have created in the previous pbix.

pre10minvalue =

var previousweek = Sheet1[pre10min]

var saleonpre = CALCULATE(SUM(Sheet1[Temperature]),FILTER(Sheet1,Sheet1[MeasurementDate] = previousweek && Sheet1[ArticleNr]=EARLIER(Sheet1[ArticleNr])))

return

IF(ISBLANK(saleonpre),0,saleonpre)

A.PNG

Now you can see the correct result.

You can also download the PBIX file to have a view.

https://www.dropbox.com/s/y1l6phnzvo7zw0o/Filtering%20table%20rows%20on%20outliers.pbix?dl=0

 

Regards,

Daniel He

Community Support Team _ Daniel He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

4 REPLIES 4
v-danhe-msft
Employee
Employee

Hi @Mark88,

Based on my test, you can follow below steps:

1.I have entered some sample data to test for your problem.

A.PNG

2.Create three calculated columns to calculate your [ProblemValue].

pre10min = [MeasurementDate]-10/1440

pre10minvalue =

var previousweek = Sheet1[pre10min]

var saleonpre = CALCULATE(SUM(Sheet1[Temperature]),ALL(Sheet1),Sheet1[MeasurementDate] = previousweek)

return

IF(ISBLANK(saleonpre),0,saleonpre)

ProblemValue = IF(([Temperature]-[pre10minvalue])=[Temperature],BLANK(),IF(([Temperature]-[pre10minvalue])>10,[Temperature],IF(([Temperature]-[pre10minvalue])<-10,[Temperature])))

B.PNG

3.Create a Slicer visual and add the [ArticleNr] field to filter out the rows with an identical ArticleNr.

C.PNG

4.Create a Table visual and add the [ArticleNr], [MeasurementData], [Temperature] and the [ProblemValue].

D.PNG

5.You can use the sort function in the table visual to order the [MeasurementData].

E.PNG

You can also download the PBIX file to have a view.

https://www.dropbox.com/s/y1l6phnzvo7zw0o/Filtering%20table%20rows%20on%20outliers.pbix?dl=0

 

Regards,

Daniel He

Community Support Team _ Daniel He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Hi @v-danhe-msft,

 

Thanks for your reply! When testing this solution on our 5M row file I faced a problem. Our machine is measuring multiple articles on the same time, so we have duplicate MeasurementsDates in the file. The formula you've created in the 'pre10minvalue' column works perfect when measured in unique time intervals (unique MeasurementDates). However, our machine is sometimes measuring 5 ArticleNr in the same time interval and your formula sums up the temperature per time interval. See the example below:

 

ArticleNrMeasurementDateTemperaturepre10minpre10minvalue
128960078585555-2-2018 11:40215-2-2018 11:300
128960078585555-2-2018 11:50215-2-2018 11:4021
128960078585555-2-2018 12:0020,85-2-2018 11:5021
128960078585555-2-2018 12:1020,85-2-2018 12:0020,8
128960078585555-2-2018 12:205,85-2-2018 12:1020,8
128960078585555-2-2018 12:3020,85-2-2018 12:205,8
128960078585555-2-2018 12:4020,85-2-2018 12:3020,8
128960078585555-2-2018 12:50215-2-2018 12:4020,8
128960078585555-2-2018 13:0020,95-2-2018 12:5021
128960078585555-2-2018 13:1020,95-2-2018 13:0020,9
1289600786419926-2-2018 00:4020,126-2-2018 00:300
1289600786419926-2-2018 00:5020,326-2-2018 00:4020,1
1289600786419926-2-2018 01:0020,126-2-2018 00:5020,3
1289600786419926-2-2018 01:1020,326-2-2018 01:0020,1
1289600786419926-2-2018 01:2054,826-2-2018 01:1020,3
1289600786419926-2-2018 01:3020,126-2-2018 01:2054,8
1289600786419926-2-2018 01:4019,926-2-2018 01:3020,1
1289600786419926-2-2018 01:5020,126-2-2018 01:4043,7
1289600786419926-2-2018 02:0019,926-2-2018 01:5020,1
1289600786419926-2-2018 02:1019,926-2-2018 02:0019,9
1289600786418826-2-2018 01:3623,626-2-2018 01:260
1289600786418826-2-2018 01:4023,826-2-2018 01:3020,1

 

The rows where the problem starts are highlighted with an underscore. You see that it sums up the temperature for that unique time interval. What I am looking for is the temperature for a unique ArticleNR+MeasurementDate combination.

 

Can you please advise how I can fix this?

Hi ,

Based on my test, you can modify the column [pre10minvalue] I have created in the previous pbix.

pre10minvalue =

var previousweek = Sheet1[pre10min]

var saleonpre = CALCULATE(SUM(Sheet1[Temperature]),FILTER(Sheet1,Sheet1[MeasurementDate] = previousweek && Sheet1[ArticleNr]=EARLIER(Sheet1[ArticleNr])))

return

IF(ISBLANK(saleonpre),0,saleonpre)

A.PNG

Now you can see the correct result.

You can also download the PBIX file to have a view.

https://www.dropbox.com/s/y1l6phnzvo7zw0o/Filtering%20table%20rows%20on%20outliers.pbix?dl=0

 

Regards,

Daniel He

Community Support Team _ Daniel He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Many thanks for your help @v-danhe-msft!! This solved the issue.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.