Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
markus_zhang
Advocate III
Advocate III

Improve understanding of the following DAX script

Hi experts,

Here is part of the "Lost Customers" measure I'm reading at:

https://www.daxpatterns.com/new-and-returning-customers/

 

Part of the code:

 

CALCULATETABLE (
                    ADDCOLUMNS (
                        VALUES ( <customer_key_column> ),
                        "CustomerLostDate", 
                            CALCULATE ( MAX ( <fact_date_column> ) ) + [LostDaysLimit]
                    ),
                    FILTER (
                        ALL ( <date_table> ),
                        AND (
                            <date_column> < MIN ( <date_column> ),
                            <date_column>
                                >= MIN ( <date_column> ) - [LostDaysLimit]
                        )
                    )
                )

 

 

My understanding of the code is:

1) The inner FILTER() is interpreted first, so we are looking at unfiltered <data_table>, and then extract those records that satisfies the big AND(), which means we only need dates that between the two MIN(). This is easy.

 

2) After the FILTER() has been interpreted, it is applied on the ADDCOLUMNS():

 

ADDCOLUMNS (
                        VALUES ( <customer_key_column> ),
                        "CustomerLostDate", 
                            CALCULATE ( MAX ( <fact_date_column> ) ) + [LostDaysLimit]
                    ),

Here is what I get confused. Since we already have a filter context from the Pivot Table, and we also have a filter context from FILTER(), I browsed through "Definite Guide to DAX" and figured that FILTER() actually enhanced current filter context, so this means, assuming that on the Pivot Table we are looking at 2006 Jan, so VALUES(<customer_key_column>) will only look at customers who made purchase in 2006 Jan, but FILTER comes in and only look at MIN(<date_coumn>) and MIN(<date_coumn>) - [LostDaysLimit]. Because <date_column> is in MIN I know that this is still restricted by the Pivot Table, so it means the FILTER() is only looking at some date range before 2006 Jan.

 

So the question is, how does FILTER() affect ADDCOLUMNS()? Assuming [LostDaysLimit] = 30, then FILTER gives us all records from 2005 Dec-2 to 2005 Dec-31, but the Pivot Table filter only looks at 2006 Jan, which one should VALUES() look at? I'm really confused about this filter maze. I'm wondering if it is easier to write something similar in SQL.

 

The thing is, there is noway for me to test my understandings. I can't actually bring the pivot table into Dax Studio and write queries against it. The pivot table more and more seem to like a curse than a blessing, as most of the difficult thing comes from it. I really would rather that we have no pivot table and (like SQL) has to build one using code.

2 ACCEPTED SOLUTIONS
AlexisOlson
Super User
Super User

You are correct that when you have the following construction

 

CALCULATE(<...>, FILTER(<table>, ...))

that the table argument of the FILTER function is taken within the local filter context.

 

However, the table in the first argument in the code you have is wrapped in an ALL function. This removes all filtering. So the filter context now starts from scratch and you just have the two conditions that are explicitly given within the AND.

 

From the documentation of the CALCULATE function,

"If the data has been filtered, the CALCULATE function changes the context in which the data is filtered, and evaluates the expression in the new context that you specify. For each column used in a filter argument, any existing filters on that column are removed, and the filter used in the filter argument is applied instead."

 

So what's happing is that your newly defined filter context is replacing your local filter context rather than adding to it, so the ADDCOLUMNS part is no longer restricted to just looking at 2006 Jan.

View solution in original post

The VALUES(<customer_key_column>) will be evaluated within its local filter context, except for any context related to date that has been replaced by the FILTER argument. For example, if the customer has something like geographic region in its filter context, this will not be removed. The filter propagates only along the relationships, so ALL(<date_table>) only removes date filters from the related fact table, not other dimensions.

 

Same story as above. The MAX(<fact_date_column>) will be evaluated within the filter context modified by the FILTER argument. That is, it will pick the maximal date from <fact_date_column> that exists within the date range specified in the FILTER function (with any non-date filter context still applied).

 

I've never heard that MIN and MAX negate ALL. That doesn't sound right to me.

View solution in original post

4 REPLIES 4
AlexisOlson
Super User
Super User

You are correct that when you have the following construction

 

CALCULATE(<...>, FILTER(<table>, ...))

that the table argument of the FILTER function is taken within the local filter context.

 

However, the table in the first argument in the code you have is wrapped in an ALL function. This removes all filtering. So the filter context now starts from scratch and you just have the two conditions that are explicitly given within the AND.

 

From the documentation of the CALCULATE function,

"If the data has been filtered, the CALCULATE function changes the context in which the data is filtered, and evaluates the expression in the new context that you specify. For each column used in a filter argument, any existing filters on that column are removed, and the filter used in the filter argument is applied instead."

 

So what's happing is that your newly defined filter context is replacing your local filter context rather than adding to it, so the ADDCOLUMNS part is no longer restricted to just looking at 2006 Jan.

Sorry, just one more thing, in ADDCOLUMNS(), VALUES ( <customer_key_column> ) is not restricted by the filter context right? Because ALL() clears all filters placed on the Date table, and by filter propagation, this also applies to the Sales table, so VALUES() should look at all customers who have made any purchase in any date.

 

However, the MAX(<fact_date_column>) should still look at the current filter context introduced by the Pivot Table, right? Because in another post someone else told me that functions like MIN(), MAX(), etc. will negate the ALL(). My head is spinning...

 

Is my understanding correct? Thanks in advance.

The VALUES(<customer_key_column>) will be evaluated within its local filter context, except for any context related to date that has been replaced by the FILTER argument. For example, if the customer has something like geographic region in its filter context, this will not be removed. The filter propagates only along the relationships, so ALL(<date_table>) only removes date filters from the related fact table, not other dimensions.

 

Same story as above. The MAX(<fact_date_column>) will be evaluated within the filter context modified by the FILTER argument. That is, it will pick the maximal date from <fact_date_column> that exists within the date range specified in the FILTER function (with any non-date filter context still applied).

 

I've never heard that MIN and MAX negate ALL. That doesn't sound right to me.

Ah, thanks, seems I mixed up CALCULATE() and FILTER() somehow. I re-read Marco's book and indeed CALCULATE() "creates a new filter context and then evaluates an expression in that new context".

 

So FILTER further restricts and CALCULATE  creates new context, got it.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.