Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

V-pazhen-msft

Sort when with same value

Scenario: 

I would like to get only one max value for each student ID. For example, a company needs to examine each person's attendance and identify the arrival time for being late monthly, but for the clarity of the report, it is required that if there is a duplication of the arrival time for being late, only one is identified.

 

Another example is a sales company that needs to mark the maximum monthly sales of each salesperson on the sales sheet and display the data during meetings. At this time, the maximum sales are duplicated, and if they are all marked, it will affect the beauty of the whole report, so only one of the duplicated maximum values should be randomly marked.

 

Table Used: 

I have the following Table that contains the student ID and their mark

Vpazhenmsft_0-1628752576657.png

 

Here is the expect output:

Vpazhenmsft_1-1628752576661.png

DAX:

If only get the max marks of each student, It is very easy. Just using the below dax, you will get the output.

 

Output = IF(

    RANKX (

        FILTER ( 'Table', 'Table'[Student ID] = EARLIER ( 'Table'[Student ID] ) ),

        'Table'[Mark],

        ,

        DESC,

        DENSE

)=1,"Max",BLANK())

Vpazhenmsft_2-1628752576663.png

 

But how to return one “Max” when the StudentID=4?

Solution 1,

Step 1,

Index add:( If the connection mode you use is Direct Query and Live Connections, it is recommended to do the relevant operations in the data source. If the data connection mode is import, you can add indexes by referring to the chart below.)

Vpazhenmsft_3-1628752576665.png

 

Step 2,

Then try to get the maximum value of each group of mark and get the maximum index of each group at the same time, and after getting that, you can rank by the largest summed value.

Max Mark =

IF(

    RANKX (

        FILTER ( 'Table', 'Table'[Student ID] = EARLIER ( 'Table'[Student ID] ) ),

        'Table'[Mark],

        ,

        DESC,

        DENSE

    )=1,1,-999)+

    RANKX (

        FILTER ( 'Table', 'Table'[Student ID] = EARLIER ( 'Table'[Student ID] ) ),

        'Table'[Index],

        ,

        DESC,

        DENSE   

    )

In the above dax, when I get each group of max value to rank, the max mark of each group is 1 and other mark will be -999. To make it to be more clear, I will directly add the order of each group of mark maximum and the maximum value of each group of index together, the maximum value cannot be accurately determined because the results are close.

Vpazhenmsft_4-1628752576671.png

To get the maximum, we need to rank based on the result after sorting the maximum value of mark. The index is just to return only one value when the maximum value appears the duplicated data.

 

Step 3,

Finally, we can use the below dax:

MAX Mark2 = IF(RANKX (

        FILTER ( 'Table', 'Table'[Student ID] = EARLIER ( 'Table'[Student ID] ) ),

       'Table'[Max Mark],

        ,

        DESC,

        DENSE

    )=1,"Max",BLANK())

 

Vpazhenmsft_5-1628752576672.png

And is there a way to create only one column (except index) to output the value? The answer is in Solution 2.

 

Solution 2,

Based on the above logic, the following dax can create only one column according to the index to get the result.

Out = IF(

RANKX( FILTER ( 'Table', 'Table'[Student ID] = EARLIER ( 'Table'[Student ID] ) ),if(

    RANKX (

        FILTER ( 'Table', 'Table'[Student ID] = EARLIER ( 'Table'[Student ID] ) ),

        'Table'[Mark],

        ,

        DESC,

        DENSE

    )=1,1,-999)+ RANKX (

            FILTER (

                 'Table',

                'Table'[Student ID] = EARLIER ('Table'[Student ID])

                  

            ) ,

            'Table'[Index],

            ,

            DESC,

            DENSE

        ),,DESC,Dense)=1,"Max",BLANK())

 

And the result is as below:

Vpazhenmsft_6-1628752576675.png

 

 

And this dax also can be applied to rank a table based on mutiple values.

 

Solution 3,

Power Query

One way to solve this and similar problems is to do the following process. Do a Group By on the column which you want to duplicate the calculation on, in this case, Student ID is column we want to duplicate the calculation. We can use the All Rows aggregation operation. This will produce a table with one row per month and a column containing nested tables. It is shown as below.

 

Vpazhenmsft_7-1628752576676.png

Each one cell of these tables contains the rows of Student ID from the original table. You can then transform your original table into a function, either in a separate query or as a step in your current query. Here’s an example of how the query below can be turned into a function that gets a table and returns a table with a rank column added:

RankFunction = (tabletorank as table) as table =>

     let

      SortRows = Table.Sort(tabletorank,{{"Mark", Order.Descending}}),

      AddIndex = Table.AddIndexColumn(SortRows, "Rank", 1, 1)

     in

      AddIndex

Next, you need to pass each nested table to this function. You could do that in a calculated column, but the most applicable way is to use the Table.TransformColumns() function which takes a function and applies it to every value in a column.

Finally, you get the final output by clicking on the Expand icon in the AllRows column and then choosing to expand all the columns in the nested table except the ones you originally grouped on:

Vpazhenmsft_8-1628752576676.png

 

Here’s the full M code:

let

    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTI0VYrVgTCNDOBMqKgRiGmCYBojmBZgpjGIaYZgmiOYlmCmCYhphMKMBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Student ID" = _t, Mark = _t]),

    //Group by Month

    Grouped = Table.Group(Source, {"Student ID"}, {{"AllRows", each _, type table}}),

    //Declare a function that adds a Rank column to a table

    RankFunction = (tabletorank as table) as table =>

     let

      SortRows = Table.Sort(tabletorank,{{"Mark", Order.Descending}}),

      AddIndex = Table.AddIndexColumn(SortRows, "Rank", 1, 1)

     in

      AddIndex,

    //Apply that function to the AllRows column

    AddedRank = Table.TransformColumns(Grouped, {"AllRows", each RankFunction(_)}),

    #"Expanded AllRows1" = Table.ExpandTableColumn(AddedRank, "AllRows", {"Mark", "Rank"}, {"AllRows.Mark", "AllRows.Rank"}),

    #"Added Custom" = Table.AddColumn(#"Expanded AllRows1", "Final", each if [AllRows.Rank] = 1 then "Max" else null)

in

    #"Added Custom"

Vpazhenmsft_9-1628752576677.png

 

Conclusion: 

Currently, there are plenty of different methods, some can give us expanded perspective. Hope this article helps everyone who encounters similar questions. 

 

Author: Lucien Wang

Reviewer: Ula Huang, Kerry Wang