Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi,
I have a MODEL where I have a Customers table and an Orders Table. I want to add a new mesure in the Customers table where i want to compute the average buying cyle. So i add this DAX mesure:
BUYING_CYCLE =
AVERAGEX(ORDERS;
DATEDIFF( CALCULATE (
MAX ( 'ORDERS'[ORDER_DATE] ) ;
FILTER(ORDERS; EARLIER('ORDERS'[ORDER_DATE]) >= 'ORDERS'[ORDER_DATE] )
); 'ORDERS'[ORDER_DATE]
; DAY )
)
This works fine but it's not correct. I should add a second filter like this:
FILTER(ORDERS; EARLIER('ORDERS'[ORDER_DATE]) >= 'ORDERS'[ORDER_DATE] ) && EARLIER('ORDERS'[ID]) <> 'ORDERS'[ID] )
But if I add this second expression the query is really slow and the memory gets to 98 plus %. There aren't a lot of records in the Orders table, like 25000.
Can anybody tell me how can i understand what is happening here?
Thank you.
Solved! Go to Solution.
The performance issue you're experiencing arises from the use of the EARLIER function, especially within a FILTER function that processes row-by-row. When you add a second filter condition using EARLIER, the complexity increases, causing the measure to consume more memory and time.
The EARLIER function is often misunderstood and is a common source of inefficiencies in DAX. When used within a row context, it returns the value of a specified column for the "earlier" row context (outer row context). This can lead to nested iterations and exponential growth in the amount of computation.
Here's a breakdown of your problem:
DATEDIFF calculates the difference between the maximum order date and the current row's order date.
The FILTER function then iterates through all rows to get the maximum order date for orders that occurred before or on the current row's order date.
Adding the second condition makes this calculation more complex, as it now has to compare the ID column for each row as well.
To optimize your DAX calculation:
Use Variables: Variables can reduce the number of times a calculation is computed, especially inside a row context.
Revise Logic: Instead of comparing every row with every other row, try to partition the data or use other DAX functions that might be more efficient.
Here's a revised version using variables:
BUYING_CYCLE =
VAR CurrentOrderDate = 'ORDERS'[ORDER_DATE]
VAR CurrentOrderID = 'ORDERS'[ID]
VAR PreviousOrderDate =
CALCULATE(
MAX('ORDERS'[ORDER_DATE]),
FILTER(
ALL('ORDERS'),
'ORDERS'[ORDER_DATE] < CurrentOrderDate && 'ORDERS'[ID] <> CurrentOrderID
)
)
RETURN
IF(
ISBLANK(PreviousOrderDate),
BLANK(),
DATEDIFF(PreviousOrderDate, CurrentOrderDate, DAY)
)
The performance issue you're experiencing arises from the use of the EARLIER function, especially within a FILTER function that processes row-by-row. When you add a second filter condition using EARLIER, the complexity increases, causing the measure to consume more memory and time.
The EARLIER function is often misunderstood and is a common source of inefficiencies in DAX. When used within a row context, it returns the value of a specified column for the "earlier" row context (outer row context). This can lead to nested iterations and exponential growth in the amount of computation.
Here's a breakdown of your problem:
DATEDIFF calculates the difference between the maximum order date and the current row's order date.
The FILTER function then iterates through all rows to get the maximum order date for orders that occurred before or on the current row's order date.
Adding the second condition makes this calculation more complex, as it now has to compare the ID column for each row as well.
To optimize your DAX calculation:
Use Variables: Variables can reduce the number of times a calculation is computed, especially inside a row context.
Revise Logic: Instead of comparing every row with every other row, try to partition the data or use other DAX functions that might be more efficient.
Here's a revised version using variables:
BUYING_CYCLE =
VAR CurrentOrderDate = 'ORDERS'[ORDER_DATE]
VAR CurrentOrderID = 'ORDERS'[ID]
VAR PreviousOrderDate =
CALCULATE(
MAX('ORDERS'[ORDER_DATE]),
FILTER(
ALL('ORDERS'),
'ORDERS'[ORDER_DATE] < CurrentOrderDate && 'ORDERS'[ID] <> CurrentOrderID
)
)
RETURN
IF(
ISBLANK(PreviousOrderDate),
BLANK(),
DATEDIFF(PreviousOrderDate, CurrentOrderDate, DAY)
)
@anagnostes I had a similar issue and my table was around the same size of rows.
The problem is with how many columns your table has. You need to try to shape it down and minimize the number of columns or increase the RAM on the pc running the PBI desktop. I upgraded to 16GB Ram and that helped very well.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
41 | |
21 | |
21 | |
14 | |
13 |
User | Count |
---|---|
43 | |
36 | |
33 | |
18 | |
18 |