Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Xerovoid
Frequent Visitor

Measure formula using SUMX, FILTER and MAX/MIN

I am trying to better understand contexts with different formulas and I noticed a very strange behaviour. I created a random set of data to play with and when evaluating the below formula for measurements in a table the total was not the expected result.

 

MeasureMax = SUMX( filter(Table1, Table1[Category] = "c") , MAX(Table1[Value]) )
MeasureMin = SUMX( filter(Table1, Table1[Category] = "c") , MIN(Table1[Value]) )

  

I exposed the details using the Table visualizaton and if I select only the rows with a Category = to "c" i get the expected calculations that calculates the MAX/MIN values of the currently selected rows (filter context) and then because of the SUMX() function it will add those values together for each row context based on the selected rows.

 

However when not selecting any rows the total does not reflect that calculation and is calculating something extra. I added some total visualisation for each measure and when selecting the rows again (4, 7, 10) and then selecting from the top down I found something strange happen. Selecting the first row would impact the MIN calculation but not the MAX but any row after the MAX value in terms of Category C would then add a mysterious value of 30 for rows (8 and 9). As for the MIN calculation it would take the min value for rows that were not Category C (when selecting from the bottom to the top).

 

Another issue is that the Total row would highlight when including the first row in the selection which I think is misleading.

 

Here is the Query I used to create Table1

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W0jUyUNJRSgRiQ6VYHSDfBMRPgvNNoLJGEB5IMhnONUZRaoZqkjlMLYRriGquJYwLMcnQFKE4FgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Value = _t, Category = _t, #"Sub Category" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Category", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Row", 1, 1),
    #"Sorted Rows" = Table.Sort(#"Added Index",{{"Row", Order.Ascending}})
in
    #"Sorted Rows"
1 ACCEPTED SOLUTION

Hey Herbet, thank you for your feedback. I appreciate it as it helps me better understand how the DAX language is functioning under the hood.

 

While I cannot dispute your solution provides the required results I still have concerns. While I cannot dispute your solution provides the required results I still have concerns.

 

The programmer in me screams loudly at the repetative nature trying to force the language to do what I thought it should be doing in the first place. To me it feels like a calculator that does not correctly multiple two numbers so the solution is to just solve it using addition (imagine calculating 4027 * 1024 in such a way).

 

I was under the impression that the SUMX(<table>, <expression>) function would evaluate the <expression> using the provided <table> much the same way that the CALCULATE(<expression>,<filter1>,<filter2>…)  function does.

 

Knowing this now I could reduce the context going into SUMX() and the only way to do that would be to leverage CALCULATE(). The reason I would prefer this method is when you get into complicated filters in real world examples that start to create very complex formulas.

 

MeasureMax = CALCULATE( SUMX( Table1 , MAX([Value]) ) , FILTER(Table1, Table1[Category] = "c") )
MeasureMin = CALCULATE( SUMX( Table1 , MIN([Value]) ) , FILTER(Table1, Table1[Category] = "c") )

Maybe I have misunderstood how the <expression> is being handled in terms of column based functions, I wonder if @ruthpozuelo might be willing to comment as an MVP?

 

View solution in original post

6 REPLIES 6
v-haibl-msft
Employee
Employee

@Xerovoid

 

What is your expected result? With your current DAX formula, the total of MeasureMin is -40 * 3, and total of MeasureMax is 90 * 3.

 

Measure formula using SUMX, FILTER and MAX MIN_1.jpg

 

Best Regards,
Herbert

My expected results is 210 for MeasureMax (70 * 3) and 45 for MeasureMin  (15 * 3).

 

Based on the documentation of the SUMX(<table>,<expression>) function you provide the <table> for which the <expression> is evaluated against. In the formula I expressed the <table> as a filtered context of the current filter context. It should be taking a subset of 'Table1' or the whole of 'Table1' if there are no page or selection filters being applied in the view. It then takes the resulting filtered 'Table1' and further filters it to only return rows where the Category is "c". So the resulting <table> should only be a rowset at most of rows [4, 7, 10].

 

The expression should then be evaluated against the values of that subset and in this case I chose to calculate the sum of the Max value, if however you dont evaluate a function and take just the [value] you get the expected total of 125.

 

MeasureSum = SUMX( filter(Table1, Table1[Category] = "c") , [Value] )

 

 

The other thing to point out is that if you look at the values in the other rows in the table the formula did not evaluate to anything, so then why is it suddenly ignoring the category filter?

 

What is really interesting is that when you remove the [Value] field it reduces the table to only the rows with a Category = "c" but still calculates total based on the Max/Min of the whole table ... my brain just cannot make sense of it.

 

Capture.JPG

Capture 2.JPG

@Xerovoid

 

To get the expected result, you need to make some changes to the second argument as below.

 

MeasureMax = 
SUMX (
    FILTER ( Table1, Table1[Category] = "c" ),
    CALCULATE ( MAX ( Table1[Value] ), FILTER ( Table1, Table1[Category] = "c" ) )
)
MeasureMin = 
SUMX (
    FILTER ( Table1, Table1[Category] = "c" ),
    CALCULATE ( MIN ( Table1[Value] ), FILTER ( Table1, Table1[Category] = "c" ) )
)

Measure formula using SUMX  FILTER and MAX MIN.jpg

 

We can also use other formulas like below.

 

Measure1 = 
CALCULATE ( MIN ( Table1[Value] ), FILTER ( Table1, Table1[Category] = "c" ) )
    * CALCULATE ( COUNTROWS ( Table1 ), FILTER ( Table1, Table1[Category] = "c" ) )
Measure2 = 
CALCULATE ( MAX ( Table1[Value] ), FILTER ( Table1, Table1[Category] = "c" ) )
    * CALCULATE ( COUNTROWS ( Table1 ), FILTER ( Table1, Table1[Category] = "c" ) )

 

Best Regards,
Herbert

Hey Herbet, thank you for your feedback. I appreciate it as it helps me better understand how the DAX language is functioning under the hood.

 

While I cannot dispute your solution provides the required results I still have concerns. While I cannot dispute your solution provides the required results I still have concerns.

 

The programmer in me screams loudly at the repetative nature trying to force the language to do what I thought it should be doing in the first place. To me it feels like a calculator that does not correctly multiple two numbers so the solution is to just solve it using addition (imagine calculating 4027 * 1024 in such a way).

 

I was under the impression that the SUMX(<table>, <expression>) function would evaluate the <expression> using the provided <table> much the same way that the CALCULATE(<expression>,<filter1>,<filter2>…)  function does.

 

Knowing this now I could reduce the context going into SUMX() and the only way to do that would be to leverage CALCULATE(). The reason I would prefer this method is when you get into complicated filters in real world examples that start to create very complex formulas.

 

MeasureMax = CALCULATE( SUMX( Table1 , MAX([Value]) ) , FILTER(Table1, Table1[Category] = "c") )
MeasureMin = CALCULATE( SUMX( Table1 , MIN([Value]) ) , FILTER(Table1, Table1[Category] = "c") )

Maybe I have misunderstood how the <expression> is being handled in terms of column based functions, I wonder if @ruthpozuelo might be willing to comment as an MVP?

 

@Xerovoid

 

Maybe you can take a look at this blog, it mentions that "Step one should always be to create measures that you want to use in the expression argument of your SUMX measure". I think it should be helpful to understand the function.

 

Best Regards,
Herbert

@v-haibl-msft

 

So reading that article suggests how SUMX() should be working in that the given expression is evaluated row by row against the provided table and then sums up the result of each row to output a final result.

 

What I have been trying to highlight is that SUMX() does not fully behave this way when your expression is column based function like SUM()/MAX()/MIN().

 

As for the approach suggested in the article, I like the idea of breaking complex calculations up into multiple fields but sadly the Microsoft engineers are doing some hidden extra magic under the hood.

 

If you take the formula

 

MeasureMax = SUMX( Table1, Max([Value]) )

 

Modifying to have an extra Measure Column

 

MaxValue = Max([Value])
MeasureMax = SUMX( Table1 , [MaxValue] )
MeasureMaxEquivalent = SUMX( Table1 , Calculate(Max([Value]) )

You dont get the expected result. Instead what is happening is the measure column being wrapped in a Calculate() which now breaks the intention of calculating the max of the given table for each row and then adding the result of each row together. Now its just simulating SUM() which as the article states is less expensive as a straight O(n) function but SUMX() is a O(n^c) function and so the more rows it has the harder the calculation is to be processed.

 

 

What does irk me is when the SUMX function is given a subset of a table it iterates according to that subset but the context provided to the expression is the filter context outside of the SUMX function. This is why you need to first reduce the filter context of the table using calculate before calling SUMX function.

 

I know this is a rare case and not a typical calculation you would use only in rare cases. But I have such a case where I needed to evaluate the maximum value based on multiple entries in a large list in order to evaluate if that was the most recent value. The reason I want to do that using a measure is because if I filter the view that excludes the latest version I need the formula to correctly evalute given the filtered table what is the current maxium value in the filtered view. I then want a formula to evaluate those items in order to score so that other consumers who are not as familiar with the data can just drag and drop the measures and be guaranteed the correct calculation.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.