Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
MJEnnis
Helper V
Helper V

Help adding a column that looks up the earliest event in another table

Can anyone see what I am doing wrong here?

 

I need to add a column in Table 1 that looks up the earliest occurence of an event in Table 2.

 

Here are simplified versions of the tables:

 

Table 1

ID1ID2
10018842
10025435
10038842
10044025

 

Table 2

ID2LABELDATE
8842Red11/11/2018
5435Blue02/05/2017
8842Blue01/25/2020
4025Orange06/22/2019
5435Red05/30/2020

 

The resulting table should look something like this:

 

ID1ID2LABEL
10018842Red
10025435Blue
10038842Red
10044025Orange

 

I have tried something this, to no avail:

Column = CALCULATE(VALUES('Table 2'[LABEL]), FILTER('Table 2','Table 2'[ID2] = 'Table 1'[ID2] && 'Table 2'[Date] = MINX(FILTER('Table 2','Table 2'[ID2] = Earlier('Table 2'[ID2])), 'Table 2'[Date])))

 

Thanks!!

 

2 ACCEPTED SOLUTIONS
VahidDM
Super User
Super User

Hi @MJEnnis 

 

Try this code to add a new column:

Lable = 
CALCULATE (
    MAX ( 'Table 2'[LABEL] ),
    FILTER (
        'Table 2',
        'Table 1'[ID2] = EARLIER ( 'Table 1'[ID2] )
            && 'Table 2'[DATE]
                = CALCULATE (
                    MIN ( 'Table 2'[DATE] ),
                    FILTER ( ALL ( 'Table 2' ), 'Table 2'[ID2] = EARLIER ( 'Table 1'[ID2] ) )
                )
    )
)

 

output:

VahidDM_0-1642633044586.png

 

 

 

 

If this post helps, please consider accepting it as the solution to help the other members find it more quickly.
Appreciate your Kudos!!
LinkedIn: 
www.linkedin.com/in/vahid-dm/

 

 

View solution in original post

Icey
Community Support
Community Support

Hi @MJEnnis ,

 


@MJEnnis wrote:

@Icey 

 

To be clearer, this is what has worked for me, which is similar to what you propose (although I cannot store the variables you propose, because Table1 is not a table in the model, but a table expression in the larger code itself: 

 

Label =
CALCULATE (
MAX ( 'Table 2'[LABEL] ),
FILTER ( 'Table 2', 'Table 2'[ID2] =  'Table 1'[ID2] )
&& 'Table 2'[DATE]
= CALCULATE (
MIN ( 'Table 2'[DATE] ),
FILTER ( ALL ( 'Table 2' ), 'Table 2'[ID2] = EARLIER ( 'Table 2'[ID2] ) )

My slight changes to the original proposal are in red. The original proposal kept timing out before it would load. So I could never really test it out. After making the changes in red, it seems to work. But I have over 30.000 rows, so I cannot confirm in every case. I just want to make sure that what I am doing works.


Based on your description and my experience and test, your expression will work.

 

 

Best Regards,

Icey

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

8 REPLIES 8
VahidDM
Super User
Super User

Hi @MJEnnis 

 

Try this code to add a new column:

Lable = 
CALCULATE (
    MAX ( 'Table 2'[LABEL] ),
    FILTER (
        'Table 2',
        'Table 1'[ID2] = EARLIER ( 'Table 1'[ID2] )
            && 'Table 2'[DATE]
                = CALCULATE (
                    MIN ( 'Table 2'[DATE] ),
                    FILTER ( ALL ( 'Table 2' ), 'Table 2'[ID2] = EARLIER ( 'Table 1'[ID2] ) )
                )
    )
)

 

output:

VahidDM_0-1642633044586.png

 

 

 

 

If this post helps, please consider accepting it as the solution to help the other members find it more quickly.
Appreciate your Kudos!!
LinkedIn: 
www.linkedin.com/in/vahid-dm/

 

 

@VahidDM

 

Thank you so much! 

 

My database is pretty large, and when I run this code, I get an insufficient memory error. However, I got it to load with a seemingly minor adjustment.

 

Before I trust the results with the adjustment, but also for my own understanding, I have a couple questions.

 

I had tried using CALCULATE(MIN(), FILTER()) instead of a MINX() function on my original code but it did not work. So it seems that the big difference between what I had originally tried and what you propose is that you have replaced VALUES() with MAX() in line 3. Why does MAX() work and not VALUES()? Just trying to understand the syntax there. 

 

Second, the timeout error happens when I use 

 

'Table 1'[ID2] = EARLIER('Table 1'[ID2])

 

But it does not happen when I use:

 

'Table 2'[ID2] = 'Table 1'[ID2]

 

In other words, when I use your proposal, but just keep the original code in line 6, it loads quite quickly and appears to still produce the desired column. Maybe I do not fully understand how the EARLIER() function works... Why do you propose to index this filter on the "earlier" occurence of ID2 in Table 1 (since the code is already filtering Table 2 for the earliest date)?

 

As said, I just want to make sure that the adaptation that is working for me is not giving me erroneous data and I want to understand the syntax, since I will need to recycle this code a couple times to finish my project.

 

Thanks a lot!!

Icey
Community Support
Community Support

Hi @MJEnnis ,

 


 

Why does MAX() work and not VALUES()?

 


VALUES() will return a table. Please check its definition:

  • When the input parameter is a column name, returns a one-column table that contains the distinct values from the specified column. Duplicate values are removed and only unique values are returned. A BLANK value can be added.
  • When the input parameter is a table name, returns the rows from the specified table. Duplicate rows are preserved. A BLANK row can be added.

In your scenario, the end result is a single value and it can work fine. But if the end result with two values, an error will occur. It is recommended to use MAX(), MIN(), FIRSTNONBLANK(), SELECTEDVALUE(), etc..

 

In addition, you can refer to this document to get more details: 

Use SELECTEDVALUE instead of VALUES in DAX - DAX | Microsoft Docs

 

Icey_0-1643007820796.png

 

 


 

'Table 1'[ID2] = EARLIER('Table 1'[ID2])

 


The EARLIER function is used to refer to the 'Table 1'[ID2] column value in the current row context. It can be improved by using a variable instead of the EARLIER function, such as:

 

 

VAR CurrentID2_ = 'Table 1'[ID2]

 

 

 

For best practices when using VAR, see Use variables to improve your DAX formulas.

 

In addition, for the FILTER() function, it should only be used when necessary. For best performance, it's recommended you use Boolean expressions as filter arguments, whenever possible. For more details, please refer:

Avoid using FILTER as a filter argument in DAX - DAX | Microsoft Docs

The FILTER Function in DAX - Excelerator BI

The FILTER Function in DAX Part 2 - Excelerator BI

 

In view of the above, please check if this expression can give better computing performance in your scenario.

 

 

Label 4 = 

//Store the current ID2.
VAR CurrentID2_ = 'Table 1'[ID2]

//Store the EarliestDate of the current ID2.
VAR EarliestDate_ =
    CALCULATE ( MIN ( 'Table 2'[DATE] ), 'Table 2'[ID2] = CurrentID2_ )

RETURN
//Return the corresponding LABEL of the EarliestDate_ of current ID2.
    CALCULATE (
        MAX ( 'Table 2'[LABEL] ),
        'Table 2'[DATE] = EarliestDate_,
        'Table 2'[ID2] = CurrentID2_  // Just in case there are the same EarliestDate_ among [ID2].
    )

 

 

 

 

 

Best Regards,

Icey

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Thanks, Icey, for the clarification on MAX() vs VALUE(). Also thanks for the recomendations on improving performance. Not all of them will work in my case, as this is all part of a much longer code for a calculated table. Cannot store VARs based on "table expressions" apparently. They have to be tables that already exist in the model, it seems. However, dropping the FILTER() commands could help, once the code is finished and just needs to be optimized.  My only remaining question: 

Will this

 

'Table 1'[ID2] = EARLIER('Table 1'[ID2])

 

return the same as this

 

'Table 2'[ID2] = 'Table 1'[ID2]

 

in this case?

 

Thanks again!

Icey
Community Support
Community Support

Hi @MJEnnis ,

 



My only remaining question: 

Will this

 

'Table 1'[ID2] = EARLIER('Table 1'[ID2])

 

return the same as this

 

'Table 2'[ID2] = 'Table 1'[ID2]

 

in this case?

 


The answer is NO. Please check:

 

min date 1.PNG

When you use 'Table 1'[ID2] = EARLIER('Table 1'[ID2]), the filtered table is "Table 1";

When you use 'Table 2'[ID2] = 'Table 1'[ID2], the filtered table is "Table 2".

 

 

Best Regards,

Icey

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Thank you, that is crystal clear. 

 

However, the original suggestion from another user was this: 

 

Lable = 
CALCULATE (
    MAX ( 'Table 2'[LABEL] ),
    FILTER (
        'Table 2',
        'Table 1'[ID2] = EARLIER ( 'Table 1'[ID2] )
            && 'Table 2'[DATE]
                = CALCULATE (
                    MIN ( 'Table 2'[DATE] ),
                    FILTER ( ALL ( 'Table 2' ), 'Table 2'[ID2] = EARLIER ( 'Table 1'[ID2] ) )

 

I could only get it to work by using 'Table 2'[ID2] = 'Table 1'[ID2] in line 6. I get that the two filters are doing two different things (though applied slightly differently than you have above). But I do not understand why the proposal worked for the other user and not for me. Whereas my slight tweak (which makes more sense to me) seems to work perfectly. 

@Icey 

 

To be clearer, this is what has worked for me, which is similar to what you propose (although I cannot store the variables you propose, because Table1 is not a table in the model, but a table expression in the larger code itself: 

 

Label =
CALCULATE (
MAX ( 'Table 2'[LABEL] ),
FILTER ( 'Table 2', 'Table 2'[ID2] =  'Table 1'[ID2] )
&& 'Table 2'[DATE]
= CALCULATE (
MIN ( 'Table 2'[DATE] ),
FILTER ( ALL ( 'Table 2' ), 'Table 2'[ID2] = EARLIER ( 'Table 2'[ID2] ) )

My slight changes to the original proposal are in red. The original proposal kept timing out before it would load. So I could never really test it out. After making the changes in red, it seems to work. But I have over 30.000 rows, so I cannot confirm in every case. I just want to make sure that what I am doing works.

Icey
Community Support
Community Support

Hi @MJEnnis ,

 


@MJEnnis wrote:

@Icey 

 

To be clearer, this is what has worked for me, which is similar to what you propose (although I cannot store the variables you propose, because Table1 is not a table in the model, but a table expression in the larger code itself: 

 

Label =
CALCULATE (
MAX ( 'Table 2'[LABEL] ),
FILTER ( 'Table 2', 'Table 2'[ID2] =  'Table 1'[ID2] )
&& 'Table 2'[DATE]
= CALCULATE (
MIN ( 'Table 2'[DATE] ),
FILTER ( ALL ( 'Table 2' ), 'Table 2'[ID2] = EARLIER ( 'Table 2'[ID2] ) )

My slight changes to the original proposal are in red. The original proposal kept timing out before it would load. So I could never really test it out. After making the changes in red, it seems to work. But I have over 30.000 rows, so I cannot confirm in every case. I just want to make sure that what I am doing works.


Based on your description and my experience and test, your expression will work.

 

 

Best Regards,

Icey

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors