Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Can anyone see what I am doing wrong here?
I need to add a column in Table 1 that looks up the earliest occurence of an event in Table 2.
Here are simplified versions of the tables:
Table 1
ID1 | ID2 |
1001 | 8842 |
1002 | 5435 |
1003 | 8842 |
1004 | 4025 |
Table 2
ID2 | LABEL | DATE |
8842 | Red | 11/11/2018 |
5435 | Blue | 02/05/2017 |
8842 | Blue | 01/25/2020 |
4025 | Orange | 06/22/2019 |
5435 | Red | 05/30/2020 |
The resulting table should look something like this:
ID1 | ID2 | LABEL |
1001 | 8842 | Red |
1002 | 5435 | Blue |
1003 | 8842 | Red |
1004 | 4025 | Orange |
I have tried something this, to no avail:
Column = CALCULATE(VALUES('Table 2'[LABEL]), FILTER('Table 2','Table 2'[ID2] = 'Table 1'[ID2] && 'Table 2'[Date] = MINX(FILTER('Table 2','Table 2'[ID2] = Earlier('Table 2'[ID2])), 'Table 2'[Date])))
Thanks!!
Solved! Go to Solution.
Hi @MJEnnis
Try this code to add a new column:
Lable =
CALCULATE (
MAX ( 'Table 2'[LABEL] ),
FILTER (
'Table 2',
'Table 1'[ID2] = EARLIER ( 'Table 1'[ID2] )
&& 'Table 2'[DATE]
= CALCULATE (
MIN ( 'Table 2'[DATE] ),
FILTER ( ALL ( 'Table 2' ), 'Table 2'[ID2] = EARLIER ( 'Table 1'[ID2] ) )
)
)
)
output:
If this post helps, please consider accepting it as the solution to help the other members find it more quickly.
Appreciate your Kudos!!
LinkedIn: www.linkedin.com/in/vahid-dm/
Hi @MJEnnis ,
@MJEnnis wrote:
To be clearer, this is what has worked for me, which is similar to what you propose (although I cannot store the variables you propose, because Table1 is not a table in the model, but a table expression in the larger code itself:
Label =
CALCULATE (
MAX ( 'Table 2'[LABEL] ),
FILTER ( 'Table 2', 'Table 2'[ID2] = 'Table 1'[ID2] )
&& 'Table 2'[DATE]
= CALCULATE (
MIN ( 'Table 2'[DATE] ),
FILTER ( ALL ( 'Table 2' ), 'Table 2'[ID2] = EARLIER ( 'Table 2'[ID2] ) )My slight changes to the original proposal are in red. The original proposal kept timing out before it would load. So I could never really test it out. After making the changes in red, it seems to work. But I have over 30.000 rows, so I cannot confirm in every case. I just want to make sure that what I am doing works.
Based on your description and my experience and test, your expression will work.
Best Regards,
Icey
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @MJEnnis
Try this code to add a new column:
Lable =
CALCULATE (
MAX ( 'Table 2'[LABEL] ),
FILTER (
'Table 2',
'Table 1'[ID2] = EARLIER ( 'Table 1'[ID2] )
&& 'Table 2'[DATE]
= CALCULATE (
MIN ( 'Table 2'[DATE] ),
FILTER ( ALL ( 'Table 2' ), 'Table 2'[ID2] = EARLIER ( 'Table 1'[ID2] ) )
)
)
)
output:
If this post helps, please consider accepting it as the solution to help the other members find it more quickly.
Appreciate your Kudos!!
LinkedIn: www.linkedin.com/in/vahid-dm/
Thank you so much!
My database is pretty large, and when I run this code, I get an insufficient memory error. However, I got it to load with a seemingly minor adjustment.
Before I trust the results with the adjustment, but also for my own understanding, I have a couple questions.
I had tried using CALCULATE(MIN(), FILTER()) instead of a MINX() function on my original code but it did not work. So it seems that the big difference between what I had originally tried and what you propose is that you have replaced VALUES() with MAX() in line 3. Why does MAX() work and not VALUES()? Just trying to understand the syntax there.
Second, the timeout error happens when I use
'Table 1'[ID2] = EARLIER('Table 1'[ID2])
But it does not happen when I use:
'Table 2'[ID2] = 'Table 1'[ID2]
In other words, when I use your proposal, but just keep the original code in line 6, it loads quite quickly and appears to still produce the desired column. Maybe I do not fully understand how the EARLIER() function works... Why do you propose to index this filter on the "earlier" occurence of ID2 in Table 1 (since the code is already filtering Table 2 for the earliest date)?
As said, I just want to make sure that the adaptation that is working for me is not giving me erroneous data and I want to understand the syntax, since I will need to recycle this code a couple times to finish my project.
Thanks a lot!!
Hi @MJEnnis ,
Why does MAX() work and not VALUES()?
VALUES() will return a table. Please check its definition:
In your scenario, the end result is a single value and it can work fine. But if the end result with two values, an error will occur. It is recommended to use MAX(), MIN(), FIRSTNONBLANK(), SELECTEDVALUE(), etc..
In addition, you can refer to this document to get more details:
Use SELECTEDVALUE instead of VALUES in DAX - DAX | Microsoft Docs
'Table 1'[ID2] = EARLIER('Table 1'[ID2])
The EARLIER function is used to refer to the 'Table 1'[ID2] column value in the current row context. It can be improved by using a variable instead of the EARLIER function, such as:
VAR CurrentID2_ = 'Table 1'[ID2]
For best practices when using VAR, see Use variables to improve your DAX formulas.
In addition, for the FILTER() function, it should only be used when necessary. For best performance, it's recommended you use Boolean expressions as filter arguments, whenever possible. For more details, please refer:
Avoid using FILTER as a filter argument in DAX - DAX | Microsoft Docs
The FILTER Function in DAX - Excelerator BI
The FILTER Function in DAX Part 2 - Excelerator BI
In view of the above, please check if this expression can give better computing performance in your scenario.
Label 4 =
//Store the current ID2.
VAR CurrentID2_ = 'Table 1'[ID2]
//Store the EarliestDate of the current ID2.
VAR EarliestDate_ =
CALCULATE ( MIN ( 'Table 2'[DATE] ), 'Table 2'[ID2] = CurrentID2_ )
RETURN
//Return the corresponding LABEL of the EarliestDate_ of current ID2.
CALCULATE (
MAX ( 'Table 2'[LABEL] ),
'Table 2'[DATE] = EarliestDate_,
'Table 2'[ID2] = CurrentID2_ // Just in case there are the same EarliestDate_ among [ID2].
)
Best Regards,
Icey
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Thanks, Icey, for the clarification on MAX() vs VALUE(). Also thanks for the recomendations on improving performance. Not all of them will work in my case, as this is all part of a much longer code for a calculated table. Cannot store VARs based on "table expressions" apparently. They have to be tables that already exist in the model, it seems. However, dropping the FILTER() commands could help, once the code is finished and just needs to be optimized. My only remaining question:
Will this
'Table 1'[ID2] = EARLIER('Table 1'[ID2])
return the same as this
'Table 2'[ID2] = 'Table 1'[ID2]
in this case?
Thanks again!
Hi @MJEnnis ,
My only remaining question:
Will this
'Table 1'[ID2] = EARLIER('Table 1'[ID2])
return the same as this
'Table 2'[ID2] = 'Table 1'[ID2]
in this case?
The answer is NO. Please check:
When you use 'Table 1'[ID2] = EARLIER('Table 1'[ID2]), the filtered table is "Table 1";
When you use 'Table 2'[ID2] = 'Table 1'[ID2], the filtered table is "Table 2".
Best Regards,
Icey
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Thank you, that is crystal clear.
However, the original suggestion from another user was this:
Lable = CALCULATE ( MAX ( 'Table 2'[LABEL] ), FILTER ( 'Table 2', 'Table 1'[ID2] = EARLIER ( 'Table 1'[ID2] ) && 'Table 2'[DATE] = CALCULATE ( MIN ( 'Table 2'[DATE] ), FILTER ( ALL ( 'Table 2' ), 'Table 2'[ID2] = EARLIER ( 'Table 1'[ID2] ) )
I could only get it to work by using 'Table 2'[ID2] = 'Table 1'[ID2] in line 6. I get that the two filters are doing two different things (though applied slightly differently than you have above). But I do not understand why the proposal worked for the other user and not for me. Whereas my slight tweak (which makes more sense to me) seems to work perfectly.
To be clearer, this is what has worked for me, which is similar to what you propose (although I cannot store the variables you propose, because Table1 is not a table in the model, but a table expression in the larger code itself:
Label =
CALCULATE (
MAX ( 'Table 2'[LABEL] ),
FILTER ( 'Table 2', 'Table 2'[ID2] = 'Table 1'[ID2] )
&& 'Table 2'[DATE]
= CALCULATE (
MIN ( 'Table 2'[DATE] ),
FILTER ( ALL ( 'Table 2' ), 'Table 2'[ID2] = EARLIER ( 'Table 2'[ID2] ) )
My slight changes to the original proposal are in red. The original proposal kept timing out before it would load. So I could never really test it out. After making the changes in red, it seems to work. But I have over 30.000 rows, so I cannot confirm in every case. I just want to make sure that what I am doing works.
Hi @MJEnnis ,
@MJEnnis wrote:
To be clearer, this is what has worked for me, which is similar to what you propose (although I cannot store the variables you propose, because Table1 is not a table in the model, but a table expression in the larger code itself:
Label =
CALCULATE (
MAX ( 'Table 2'[LABEL] ),
FILTER ( 'Table 2', 'Table 2'[ID2] = 'Table 1'[ID2] )
&& 'Table 2'[DATE]
= CALCULATE (
MIN ( 'Table 2'[DATE] ),
FILTER ( ALL ( 'Table 2' ), 'Table 2'[ID2] = EARLIER ( 'Table 2'[ID2] ) )My slight changes to the original proposal are in red. The original proposal kept timing out before it would load. So I could never really test it out. After making the changes in red, it seems to work. But I have over 30.000 rows, so I cannot confirm in every case. I just want to make sure that what I am doing works.
Based on your description and my experience and test, your expression will work.
Best Regards,
Icey
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
47 | |
26 | |
19 | |
14 | |
10 |
User | Count |
---|---|
58 | |
50 | |
44 | |
19 | |
18 |