Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Scenario:
I would like to get only one max value for each student ID. For example, a company needs to examine each person's attendance and identify the arrival time for being late monthly, but for the clarity of the report, it is required that if there is a duplication of the arrival time for being late, only one is identified.
Another example is a sales company that needs to mark the maximum monthly sales of each salesperson on the sales sheet and display the data during meetings. At this time, the maximum sales are duplicated, and if they are all marked, it will affect the beauty of the whole report, so only one of the duplicated maximum values should be randomly marked.
Table Used:
I have the following Table that contains the student ID and their mark:
Here is the expect output:
DAX:
If only get the max marks of each student, It is very easy. Just using the below dax, you will get the output.
Output = IF(
RANKX (
FILTER ( 'Table', 'Table'[Student ID] = EARLIER ( 'Table'[Student ID] ) ),
'Table'[Mark],
,
DESC,
DENSE
)=1,"Max",BLANK())
But how to return one “Max” when the StudentID=4?
Solution 1,
Step 1,
Index add:( If the connection mode you use is Direct Query and Live Connections, it is recommended to do the relevant operations in the data source. If the data connection mode is import, you can add indexes by referring to the chart below.)
Step 2,
Then try to get the maximum value of each group of mark and get the maximum index of each group at the same time, and after getting that, you can rank by the largest summed value.
Max Mark =
IF(
RANKX (
FILTER ( 'Table', 'Table'[Student ID] = EARLIER ( 'Table'[Student ID] ) ),
'Table'[Mark],
,
DESC,
DENSE
)=1,1,-999)+
RANKX (
FILTER ( 'Table', 'Table'[Student ID] = EARLIER ( 'Table'[Student ID] ) ),
'Table'[Index],
,
DESC,
DENSE
)
In the above dax, when I get each group of max value to rank, the max mark of each group is 1 and other mark will be -999. To make it to be more clear, I will directly add the order of each group of mark maximum and the maximum value of each group of index together, the maximum value cannot be accurately determined because the results are close.
To get the maximum, we need to rank based on the result after sorting the maximum value of mark. The index is just to return only one value when the maximum value appears the duplicated data.
Step 3,
Finally, we can use the below dax:
MAX Mark2 = IF(RANKX (
FILTER ( 'Table', 'Table'[Student ID] = EARLIER ( 'Table'[Student ID] ) ),
'Table'[Max Mark],
,
DESC,
DENSE
)=1,"Max",BLANK())
And is there a way to create only one column (except index) to output the value? The answer is in Solution 2.
Solution 2,
Based on the above logic, the following dax can create only one column according to the index to get the result.
Out = IF(
RANKX( FILTER ( 'Table', 'Table'[Student ID] = EARLIER ( 'Table'[Student ID] ) ),if(
RANKX (
FILTER ( 'Table', 'Table'[Student ID] = EARLIER ( 'Table'[Student ID] ) ),
'Table'[Mark],
,
DESC,
DENSE
)=1,1,-999)+ RANKX (
FILTER (
'Table',
'Table'[Student ID] = EARLIER ('Table'[Student ID])
) ,
'Table'[Index],
,
DESC,
DENSE
),,DESC,Dense)=1,"Max",BLANK())
And the result is as below:
And this dax also can be applied to rank a table based on mutiple values.
Solution 3,
Power Query:
One way to solve this and similar problems is to do the following process. Do a Group By on the column which you want to duplicate the calculation on, in this case, Student ID is column we want to duplicate the calculation. We can use the All Rows aggregation operation. This will produce a table with one row per month and a column containing nested tables. It is shown as below.
Each one cell of these tables contains the rows of Student ID from the original table. You can then transform your original table into a function, either in a separate query or as a step in your current query. Here’s an example of how the query below can be turned into a function that gets a table and returns a table with a rank column added:
RankFunction = (tabletorank as table) as table =>
let
SortRows = Table.Sort(tabletorank,{{"Mark", Order.Descending}}),
AddIndex = Table.AddIndexColumn(SortRows, "Rank", 1, 1)
in
AddIndex
Next, you need to pass each nested table to this function. You could do that in a calculated column, but the most applicable way is to use the Table.TransformColumns() function which takes a function and applies it to every value in a column.
Finally, you get the final output by clicking on the Expand icon in the AllRows column and then choosing to expand all the columns in the nested table except the ones you originally grouped on:
Here’s the full M code:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTI0VYrVgTCNDOBMqKgRiGmCYBojmBZgpjGIaYZgmiOYlmCmCYhphMKMBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Student ID" = _t, Mark = _t]),
//Group by Month
Grouped = Table.Group(Source, {"Student ID"}, {{"AllRows", each _, type table}}),
//Declare a function that adds a Rank column to a table
RankFunction = (tabletorank as table) as table =>
let
SortRows = Table.Sort(tabletorank,{{"Mark", Order.Descending}}),
AddIndex = Table.AddIndexColumn(SortRows, "Rank", 1, 1)
in
AddIndex,
//Apply that function to the AllRows column
AddedRank = Table.TransformColumns(Grouped, {"AllRows", each RankFunction(_)}),
#"Expanded AllRows1" = Table.ExpandTableColumn(AddedRank, "AllRows", {"Mark", "Rank"}, {"AllRows.Mark", "AllRows.Rank"}),
#"Added Custom" = Table.AddColumn(#"Expanded AllRows1", "Final", each if [AllRows.Rank] = 1 then "Max" else null)
in
#"Added Custom"
Conclusion:
Currently, there are plenty of different methods, some can give us expanded perspective. Hope this article helps everyone who encounters similar questions.
Author: Lucien Wang
Reviewer: Ula Huang, Kerry Wang
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.