Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi experts,
I have been experimenting with expanded tabels and came up to this strange behaviour, which hopefully somebody can explain. I built two simple tables:
They are related via the ProductID field. To my best understanding expanded table looks like this:
Now I've build a couple of measures to sum Amount by ProductID. However I get different results in the following two cases :
Maybe you know why is that? I was always thinking that you can filter on any of the fields of an expanded table and the results should be the same.
Solved! Go to Solution.
Your confusion lies in the fact that even though an expanded table is a series of left-outer joins of tables from many -> 1, the expanded table still has all the columns of all the tables. And tables are a collection of independent columns. Meaning that a filter on one column does not affect another until after applied to the model. (and even then the impact if from cross filtering rows) Take this code here:
Sum Amount = CALCULATE(SUM(t3Sales[Amount]),t3ProductName[ProductID])
internally, it is written as this:
Sum Amount = CALCULATE ( SUM ( t3Sales[Amount] ), FILTER ( ALL ( t3ProductName[ProductID] ), t3ProductName[ProductID] ) )
So it will return all the ProductID's from the t3ProductName column. But the table also has a filter on t3Sales[ProductID] because you put it on the rows which when combined filters down the t3Sales table to just where t3Sales[ProductID] = 100 for row 1, and so on for rest of the table. Grand total has no filter on t3Sales[ProductID] and only has the above filter on tsProductName[ProductID] column which returns all values, so you get the correct grand total.
next is your Sum Amount2
Sum Amount2 = CALCULATE(SUM(t3Sales[Amount]),t3Sales[ProductID])
which internally is rewritten as:
Sum Amount2 = CALCULATE ( SUM ( t3Sales[Amount] ), FILTER ( ALL ( t3Sales[ProductID] ), t3Sales[ProductID] ) )
So, even though you put t3Sales[ProductID] on the rows, since you also referenced that column as a filter argument to CALCULATE, the table row filter is overridden by the above FILTER statement which returns all ProductID for t3Sales for every row which is why you get the grand total amount for every row in table. Difference from Sum Amount is now there is no filter on the t3ProductName[ProductID].
Make sense?
"Context Transition" refers to moving a column filter from row contex to filter context.
"Modify filter context" means just that. When you have columns on the rows, columns, filter area, slicers of report that is the initial filter context. Either row context transition or arguments to 'CALCULATE' may change ( or modify) that filter context.
Hi @gvg,
Based on my test, I found that if we add t3Sales[ProductID] into table visual, and for measure [SumAmount1], we refer to its related table column t3ProductName[ProductID], it will calculate total values grouped by ProductID. However, for measure [SumAmount2], we refer to its own column t3Sales[ProductID], the formula is considered as SumAmount2 = CALCULATE(SUM(t3Sales[Amount]),ALLSELECTED(t3Sales[ProductID])), means regardless of t3Sales[ProductID]. So, in your scenario, you should modify the formula for SumAmount2 to:
SumAmount2 = SUM(t3Sales[Amount])
Best regards,
Yuliana Gu
Well, I have t3Sales[ProductID] in both tables, that I am putting to test. However interestingly, if I change t3Sales[ProductID] to t3ProductName[ProductID] in the last visual, I get correct results. Getting totally confused 😞
Here is my test pbix file Expanded table test .
Your confusion lies in the fact that even though an expanded table is a series of left-outer joins of tables from many -> 1, the expanded table still has all the columns of all the tables. And tables are a collection of independent columns. Meaning that a filter on one column does not affect another until after applied to the model. (and even then the impact if from cross filtering rows) Take this code here:
Sum Amount = CALCULATE(SUM(t3Sales[Amount]),t3ProductName[ProductID])
internally, it is written as this:
Sum Amount = CALCULATE ( SUM ( t3Sales[Amount] ), FILTER ( ALL ( t3ProductName[ProductID] ), t3ProductName[ProductID] ) )
So it will return all the ProductID's from the t3ProductName column. But the table also has a filter on t3Sales[ProductID] because you put it on the rows which when combined filters down the t3Sales table to just where t3Sales[ProductID] = 100 for row 1, and so on for rest of the table. Grand total has no filter on t3Sales[ProductID] and only has the above filter on tsProductName[ProductID] column which returns all values, so you get the correct grand total.
next is your Sum Amount2
Sum Amount2 = CALCULATE(SUM(t3Sales[Amount]),t3Sales[ProductID])
which internally is rewritten as:
Sum Amount2 = CALCULATE ( SUM ( t3Sales[Amount] ), FILTER ( ALL ( t3Sales[ProductID] ), t3Sales[ProductID] ) )
So, even though you put t3Sales[ProductID] on the rows, since you also referenced that column as a filter argument to CALCULATE, the table row filter is overridden by the above FILTER statement which returns all ProductID for t3Sales for every row which is why you get the grand total amount for every row in table. Difference from Sum Amount is now there is no filter on the t3ProductName[ProductID].
Make sense?
Yes, this makes sense. Thanks!
Well, I am still a little confused. Let's break this all down in small steps for Sum Amount2 for the first value. My understanding is that before CALCULATE starts its job
1. The expanded table is filtered on t3Sales{ProductID]=100 since this field is in the visual. Thus our expanded table looks like this:
2. CALCULATE starts evaluating its filter. This FILTER function sees the table shown in step 1.
3. Since FILTER includes ALL as per your description, the expanded table gets expanded to the original like this:
Or does it expand to this only:
4. CALCULATE begins interpreting the filtered expanded table, which in fact is the whole (?) table.
5. CALCULATE makes context transition, i.e. filters the expanded table with t3Sales[ProductID]=100 like this:
6. Sums the Amount. Shouldn't it arrive at 50?
First, in your actual data in t3Sales, there are two rows with ProductID = 300 & Amount = 50 which is where the total value of 215 comes from. You don't show it in your screenshots.
Second, the order of your logic is not quite correct. CALCULATE does the following:
This means the filter context is constructed prior to being applied to the data tables - not during. So the table itself doesn't expand and contract per your steps - the only thing that changes are what filters are on columns.
1. The filter context has: a filter on the column t3Sales[ProductID] of t3Sales[ProductID] = 100. (no impact to table yet)
2. CALCULATE triggers creation of new filter context by cloning existing which in this case is the filter in step 1.
3. CALCULATE evaluates its setfilter argument which is expanded to this:
FILTER ( ALL ( t3Sales[ProductID] ), t3Sales[ProductID] )
this FILTER iterates over all distinct values in the column, with boolean expression being true for all of them - and therefore returns all distinct values from column t3Sales[ProductID]. Also, since this is the same column as step 1, this new filter blocks(removes) step 1 and replaces it in the filter context with this new one (again, all distinct values for column t3Sales[ProductID].
4. CALCULATE evaluates the first argument 'SUM ( t3Sales[Amount] )' by applying context we ended up with in step 3 to the t3Sale table. Since after being applied filter context returns all rows, we end up with a value of 215.(and will for all other rows in table).
And I would like to point out a couple more things. First when putting t3Sales[ProductID] on the rows of the table, that is a filter context, not a row context. There are only two places to generate a row context. First with a calculated column there exists a row context. Second is programatically with an iterator function. Both are transitioned from row to filter context by use of CALCULATE or some other table function like FIRSTNONBLANK, LASTDATE, etc.
But again putting something on rows, columns, page/report filter area are NOT row contexts. They are filter contexts.
Any clearer? Clear as mud?
Thanks for explanation @mattbrice. I believe I am coming to understanding but not there yet.
"... in your actual data in t3Sales, there are two rows with ProductID = 300 & Amount = 50"
What do you mean by that? I see only one row with ProductID = 300 & Amount = 50. Another row is with ProductID = 300 & Amount = 40.
" You don't show it in your screenshots"
They are here in step 3, aren't they:
Now, does CALCULATE iterate t3Sales or the expanded table?
Here is a screenshot of the acutal t3Sales table in your file I downloaded. It has two rows for ProductID = 300 & Amount = 50. Add up 'Amount' column and you get 215
@gvg wrote:Now, does CALCULATE iterate t3Sales or the expanded table?
To be precise, CALCULATE doesn't iterate anything. CALCULATE generates the filter context under which the first parameter expression is evaluated. That's it. What is filter context? It is a set of column filters like '[ProductID] = 300',
'[Amount] > 40'.
There is only one t3Sales and it is always an expanded table. It is t3Sales LOJ t3ProductName. So if you put a filter on a column in t3ProductName[ProductID], you are actually putting a filter on 't3ProductName[ProductID]' in the t3Sales expanded table because it includes all the colums from t3ProductName. Likewise, if you put the t3ProductName[ProductID] on rows of visual, yet measure is:
CALCULATE( SUM ( t3Sales[Amount] ), ALL ( t3Sales ) )
the ALL will block all column filters for the entire expanded table, including the column put on the visual rows, and return 215.
ok?
OK now. Thanks! Up till now I had wrong understanding that row context overrides Filter, specified as a second parameter in CALCULATE. It looks like row context transition takes place before CALCULATE's FILTER takes effect.
Well.....
Terminology matters, and in this particular example no row context transition ever takes place so it doesn't apply.
But generally speaking, like i listed before CALCULATE triggers this 4 step process:
So yes, step 2 happens before step 3 and therefore step 2 may be overidden by step 3. And when you do have row context transition, it only ( potentially ) impacts the first argument to CALCULATE.
I thought step 2 is context transition? If not, what context transition is?
I posted similar question on sqlbi.com and this is what Marco Russo replied:
https://www.sqlbi.com/articles/understanding-context-transition/
Getting even more confused, what context transition is 😞 ...
To be clear: If you put a value on the "Rows" section of a Matrix visual, that is a Filter context, not Row context. Again, row context exists in only two places: In a data table when adding calculated column & programatically by using an iterator function. But it takes another function to trigger transition from row context to filter context.
Here is a simple one column table i did in Excel:
with two calculated columns I wrote:
Max Date = MAX ( Calendar[Date] ) Calculate Max Date = CALCULATE ( MAX ( Calendar[Date] ) )
Row context exists in a data table, but only filter context impacts results. in '[Max Date]' row context for each row value computed is the row's respective date, but since no context transition, filter context is empty and so calculated column returns the max value for entire column which is 1/4/2017.
In '[Calculate Max Date]' we wrap it in a CALCUATE. CALCULATE triggers context transition and removes row's respective date from row context, and adds to filter context. So when the MAX evaluates, there is only one value left which is current row's 'Date' which is what it returns.
For measures, FILTER is a very common iterator. What is difference between these two uses?
FILTER ( ALL ( Calendar ) , Calendar[Date] <= MAX (Calendar[Date] ) ) FILTER ( ALL ( Calendar ) , Calendar[Date] <= LASTDATE (Calendar[Date] ) )
FILTER setups up iteration of entire 'Calendar' table.
The first measure uses 'MAX', which will return the last date value visible in the current filter context which is most likely what you had intended. So row context exists, but no trigger to move it to filter context and there for FILTER will return subset of Calendar table.
But the second measure uses 'LASTDATE' which is a table function that does trigger context transition. So for each row being iterated, LASTDATE will remove the respective date value from row context and add to filter context. Therefore when evalueated, LASTDATE will always return the currently iterated rows' value, which is turn will make the statement always true for entire Calendar table no matter what you have selected in a slicer (or some other means of selecting a date range). Probabaly not what was intended.
So that's row context and how it impacts results...clear? Or even more confusing?
I have read all of Marco & Alberto's articles. They are definitely the go to guys on Dax. And I read your questions in the article and of course Marco is 100% correct. I just think you misunderstood some of his answers.
@mattbrice, thank you very much for your extensive explanations. Will have to walk them through a few more times. Just a small question - is term "context transition" means the same as "modify filter context"?
"Context Transition" refers to moving a column filter from row contex to filter context.
"Modify filter context" means just that. When you have columns on the rows, columns, filter area, slicers of report that is the initial filter context. Either row context transition or arguments to 'CALCULATE' may change ( or modify) that filter context.
I think, the way you used the Sum formula is incorrect. Since, you have already created the relationship with Product ID, you can just drag a table visual then, select ProductID and Amount. Usually, power bi will do the sum based on Product id, if not, just select the Amount under Visualization-->Values then select the Small down arrow and then select Sum.
Sure, I can get away without using measures here. But I want to understand why CALCULATE makes different context transitions on the same expanded table.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
110 | |
100 | |
80 | |
64 | |
58 |
User | Count |
---|---|
148 | |
111 | |
94 | |
84 | |
67 |