Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I've been asked to work out if its possible to do a couple of things in Power BI, the first being rebasing a columns data based on the median value of the first 30 rows of data for each unique Component.
So, I have a table called ComponentData that contains a row of data for each time a component has been used - this is used to track the components degredation over time. This table contains data from multiple components.
Example of data:
Id | Value | Uses |
1 | 55 | 1 |
1 | 50 | 2 |
1 | 48 | 3 |
1 | 43 | 5 |
2 | 67 | 1 |
2 | 66 | 2 |
2 | 64 | 3 |
2 | 63 | 4 |
2 | 61 | 5 |
So somehow I need to be able to be able to select the first 30 rows of data (for each unique component) and calculate the median value of those rows, then I need a new column which deductions the median just calculated from each value in Value column.
Is this even possible?
The second item is a new column which calculates a moving avarage of the Value column. This first 30 values in this new column would be null, after that point the value would equal an average of the last 30 rows. This is used to smooth out the data. Again I have no idea is power bi can do this?
Any help would be more than welcome
Solved! Go to Solution.
Ok sorry for the spam, I think this is a much better solution, unless you have many other uses for the index column created in my last post. Use the same index column as in my initial post (ignoring ID) and then use these two formulas in DAX:
MinusMedian2 = Table1[Value] - MEDIANX(TOPN(3, FILTER(ALL(Table1), Table1[Id] = EARLIER(Table1[Id])), Table1[Index], ASC), VALUE(Table1[Value]))
Moving2 =
IF(COUNTROWS(FILTER(Table1, Table1[Index] < EARLIER(Table1[Index]) && Table1[Id] = EARLIER(Table1[Id]))) <= 2, BLANK(),
AVERAGEX(TOPN(3, FILTER(Table1, Table1[Index] < EARLIER(Table1[Index]) && Table1[Id] = EARLIER(Table1[Id])), Table1[Index.1], DESC), Table1[Value]))
You can probably tell that I decreased the threshhold from 5 to 3 to get some results, so replace any instance of 2 with 29 and 3 with 30
Hi Andy,
This is possible, to start I would say make an index column (this is easiest in query editor, so I'll show it there):
Then in DAX you can do something like this for your first question:
MinusMedian = Table1[Value] - MEDIANX(FILTER(ALL(Table1), Table1[Index] <= 4), VALUE(Table1[Value]))
And this for your second:
Moving =
IF(Table1[Index] <= 4, BLANK(),
AVERAGEX(FILTER(Table1, Table1[Index] < EARLIER(Table1[Index]) && Table1[Index] >= EARLIER(Table1[Index]) - 5), Table1[Value]))
Here's the resulting table I get:
Note that I used 5 instead of 30 to get meaningful results on your small sample dataset. Anywhere you see a 5, replace with 30, and 4 with 29.
Hi Jahida,
Thanks for the reply - this works great if there is only a single component type in the table, i.e a component with an Id of 1
I have provided a sample set of data that is perhaps to small to actually work with, sorry.
What I've got is data for multiple components in my table - this is so I can plot a line for each component type and see the degredation against multiple components in a line chart.
So for each component ( unique by the Id column) I need a running average and a median rebase.
Using your index method when I get to the component with an Id of 2 - I would need the index to start again at the begining for your method to work I beleive?
Hi Andyr,
Sorry, I can't read.
I'm much better at DAX than M but I gave making the index column a shot, and think I got it, although it's a bit convoluted. Here's the code I used:
Partitioned = Table.Partition (#"Changed Type2", "Id", List.Max(Table.Column(#"Changed Type2", "Id")), each _), #"Added Index" = List.Transform(Partitioned, each {List.Max(Table.Column(_, "Id")), Table.RemoveColumns(Table.AddIndexColumn(_, "Index"), "Id")}), #"Converted to Table" = Table.FromPartitions("Id", #"Added Index") in #"Converted to Table"
A key thing to note is that this formula makes sense for Ids that are reasonably low, numeric, and positive (the example you gave me of 1 and 2), but would need to be adapted if the IDs don't meet that criteria. I don't know what types of IDs you're looking at, if they're low and include zero there's an easy fix (just add 1 in the partition function and the number of partitions), if they're not low then it'll take a bit more work.
MinusMedian = Table1[Value] - MEDIANX(FILTER(ALL(Table1), Table1[Index] <= 2 && Table1[Id] = EARLIER(Table1[Id])), VALUE(Table1[Value]))
Moving =
IF(Table1[Index] <= 2, BLANK(),
AVERAGEX(FILTER(Table1, Table1[Index] < EARLIER(Table1[Index]) && Table1[Index] >= EARLIER(Table1[Index]) - 3 && Table1[Id] = EARLIER(Table1[Id])), Table1[Value]))
When I get time a bit later I'll work through a more DAX-heavy solution, which should be more flexible in Ids (planning on using TOPN in addition to the filters so that the previous index column is sufficient).
Ok sorry for the spam, I think this is a much better solution, unless you have many other uses for the index column created in my last post. Use the same index column as in my initial post (ignoring ID) and then use these two formulas in DAX:
MinusMedian2 = Table1[Value] - MEDIANX(TOPN(3, FILTER(ALL(Table1), Table1[Id] = EARLIER(Table1[Id])), Table1[Index], ASC), VALUE(Table1[Value]))
Moving2 =
IF(COUNTROWS(FILTER(Table1, Table1[Index] < EARLIER(Table1[Index]) && Table1[Id] = EARLIER(Table1[Id]))) <= 2, BLANK(),
AVERAGEX(TOPN(3, FILTER(Table1, Table1[Index] < EARLIER(Table1[Index]) && Table1[Id] = EARLIER(Table1[Id])), Table1[Index.1], DESC), Table1[Value]))
You can probably tell that I decreased the threshhold from 5 to 3 to get some results, so replace any instance of 2 with 29 and 3 with 30
Hi Jahida,
Thanks - the MinusMedian2 is working spot one I believe!
The Moving2 seems to have a typo in index.1 - I've replaced this with index but it is just returns Id's - I've ended up pre processing this one with an R script in the end...
Thanks for the help!
You'r right, Index.1 was a typo and should be Index. Weird it doesn't work for you, it seems fine on my end... If the R script works for you, great, if not, maybe you could give me a bit more info and I could try to debug.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
97 | |
96 | |
81 | |
74 | |
66 |
User | Count |
---|---|
126 | |
106 | |
105 | |
86 | |
72 |