Solved: Re: LOOKUPVALUE in range of dates

Anonymous · ‎08-20-2017

Dear all,

I have two tables:

TABLE_1, that has a list of all employees on each month (below, just an example with one employee_id)

EMPLOYEE_ID	MONTHLY PERIOD
2875	10/01/2016
2875	11/01/2016
2875	07/01/2017
2875	08/01/2017

TABLE_2 that has information for each employee of absenses (example: vacations, medical leave, pregnancy, etc).

EMPLOYEE_ID	START	END	TYPE
2875	10/15/2016	11/05/2016	120 (VACATIONS)
2875	07/20/2017	08/10/2017	290 (MEDICAL LEAVE)

I´m trying to add a calculated column in TABLE_1 so I would have, for each period, what was the condition of the employee.

The conditions would be:

"IF (TABLE_2[START] < ENDOFMONTH(TABLE_1[MONTHLY PERIOD])

&& TABLE_2[END] > ENDOFMONTH(TABLE_1[MONTHLY PERIOD]))

THEN TABLE_2[TYPE]" <-- of that specific line

ELSE "ACTIVE"

The TABLE_1 would result in the below:

EMPLOYEE_ID	MONTHLY PERIOD	TYPE
2875	10/01/2016	120 (VACATIONS)
2875	11/01/2016	ACTIVE
2875	07/01/2017	290 (MEDICAL LEAVE)
2875	08/01/2017	ACTIVE

Ideas on how to do it would be appreciated!

TomMartens · ‎08-22-2017

Hey,

here you will find pbix file and here the sample data

There is a new table "Sheet3" that represents the table with the uniqe "Employee_ID". From my understanding the relationships should look like this

There is a new measure in Sheet1

Measure 2 = 
IF(ISFILTERED(Sheet1[EMPLOYEE_ID])
,CALCULATE(
    CONCATENATEX(
            SELECTCOLUMNS('Sheet1', "EMPLOYEE_ID", 'Sheet1'[EMPLOYEE_ID], "MONTHLY_PERIOD", 'Sheet1'[MONTHLY PERIOD])
            ,var currentEmployee = [EMPLOYEE_ID] 
            var currentMonthlyPeriod = CALCULATE(EOMONTH(MAX('Sheet1'[MONTHLY PERIOD]),0)) 
            var empType = 
			CONCATENATEX('Sheet2'
                ,IF('Sheet2'[EMPLOYEE_ID] = currentEmployee 
			     && 'Sheet2'[START] < currentMonthlyPeriod 
			     && 'Sheet2'[END] > currentMonthlyPeriod
			     ,'Sheet2'[TYPE]
			     ,BLANK()
                )) 
			return
            IF(empType = "","ACTIVE",empType)
			) 
        )
)

and also a column in Sheet1, that is basically the same as the measure except the 1st check if Sheet1[Employee_ID] is filtered.

Dependig on the table size of your "Sheet1", I would recommend to use the column version if the table size is small (what ever that means) ;-), this is due to memory consumption of "physical / calculated columns".

Regards

Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

View solution in original post

TomMartens · ‎08-22-2017

Hey @Anonymous,

if you have some time, you may redownload the pbix file. There are two new calculated columns: column 3 and column 4.

If you are sure that there is just one daterange in table2 for any given date from table1 try column4, else go with column3.

Column3 is a little optimized in comparison to the initial column whereas Column 4 assumes just one valid daterange for any given employee / period.

I have to admit that I'm somewhat intereested if there a performance gain using the new columns

Happy to help

Regards

Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

View solution in original post

Anonymous · ‎08-22-2017

Tom,

I tested all of them. Column 4 is perfect for me, and it was extremely fast.

In my data set:

Column took 32 seconds

Column 2 took more than 3 minutes and I killed it

Column 3 took more than 3 minutes and I killed it

Column 4 took 3 seconds

I´m in a Dell M3800, i7 with 16gb RAM.

View solution in original post

Anonymous · ‎08-20-2017

It looks like what I´m trying to do is a "many to many relationship", isn´t it? I tried to read the articles on it and learn, but still having difficulties.

TomMartens · ‎08-21-2017

Hey,

using your sample data, I came up with this measure

Measure = 
IF(ISFILTERED('Sheet1'[EMPLOYEE_ID])
,IF(
CONCATENATEX(	'Sheet1'
	,var currentEmployee = 'Sheet1'[EMPLOYEE_ID]
	var currentDate = CALCULATE(EOMONTH(MAX('Sheet1'[MONTHLY PERIOD]),0))
	return
	CONCATENATEX('Sheet2'
		,IF('Sheet2'[EMPLOYEE_ID] = currentEmployee 
			&& 'Sheet2'[START] < currentDate 
			&& 'Sheet2'[END] > currentDate
			,'Sheet2'[TYPE]
			,BLANK()
		)
	)
)
<>""
,CONCATENATEX(	'Sheet1'
	,var currentEmployee = 'Sheet1'[EMPLOYEE_ID]
	var currentDate = CALCULATE(EOMONTH(MAX('Sheet1'[MONTHLY PERIOD]),0))
	return
	CONCATENATEX('Sheet2'
		,IF('Sheet2'[EMPLOYEE_ID] = currentEmployee 
			&& 'Sheet2'[START] < currentDate 
			&& 'Sheet2'[END] > currentDate
			,'Sheet2'[TYPE]
			,BLANK()
		)
	)
)
,"ACTIVE"
))

This measure creates this output

Hope this helps, currently I've been in a hurry so there may be potential for some perfomance tweeks.

Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

Anonymous · ‎08-22-2017

Hi Tom,

The measure worked and it´s pretty close to the solution.

However, somehow my slicers are not filtering the result. I tried to add some "ALLEXCEPT" in the formula, but wasn´t able to get it right.

I´m trying to "slice it" through 'Sheet2'[EMPLOYEE_ID]

Thank you

TomMartens · ‎08-22-2017

Hey Eduardo,

I'm wondering how your datamodel looks like and if there is any relationship between your tables Table1 and Table2, may it direct or indirect (e.g. a table that relates to both tables - 'emplyoee'.

How do you try to achieve slice table1 by a column from table2 if both tables are not related, can you please mock a report and share an image with some annotations.

Wondering what do you expect, when a user selectcs 2 id's or none.

Regards

Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

Anonymous · ‎08-22-2017

Tom,

You are correct on the indirect relationship. It looks like this:

TABLE_1 = SAP_HR-VW_Z_FOLHA and TABLE_2 = SAP_HR-Z_PA2001

The indirect relation is a table that contains distinct employee_ids, with all sort of information from the person (date of birth, address, gender, date of admission, date of dismissal, etc).

What I´m trying to do is "select a person" (a name, from this indirect relation), and have the employee_ID filtered on both TABLE_1 and TABLE_2.

If I try to slice through this indirect relation, see that both employee_ids (left and right) are filtered, but not the result of the measure below.If I slice through "TABLE_2[EMPLOYEE_ID" nothing happens.Finally, If I slice it through "TABLE_1[EMPLOYEE_ID]" I get the expected result.

Anonymous · ‎08-22-2017

Could this be a calculated column instead of a measure?

TomMartens · ‎08-22-2017

Hey,

here you will find pbix file and here the sample data

There is a new table "Sheet3" that represents the table with the uniqe "Employee_ID". From my understanding the relationships should look like this

There is a new measure in Sheet1

Measure 2 = 
IF(ISFILTERED(Sheet1[EMPLOYEE_ID])
,CALCULATE(
    CONCATENATEX(
            SELECTCOLUMNS('Sheet1', "EMPLOYEE_ID", 'Sheet1'[EMPLOYEE_ID], "MONTHLY_PERIOD", 'Sheet1'[MONTHLY PERIOD])
            ,var currentEmployee = [EMPLOYEE_ID] 
            var currentMonthlyPeriod = CALCULATE(EOMONTH(MAX('Sheet1'[MONTHLY PERIOD]),0)) 
            var empType = 
			CONCATENATEX('Sheet2'
                ,IF('Sheet2'[EMPLOYEE_ID] = currentEmployee 
			     && 'Sheet2'[START] < currentMonthlyPeriod 
			     && 'Sheet2'[END] > currentMonthlyPeriod
			     ,'Sheet2'[TYPE]
			     ,BLANK()
                )) 
			return
            IF(empType = "","ACTIVE",empType)
			) 
        )
)

and also a column in Sheet1, that is basically the same as the measure except the 1st check if Sheet1[Employee_ID] is filtered.

Dependig on the table size of your "Sheet1", I would recommend to use the column version if the table size is small (what ever that means) ;-), this is due to memory consumption of "physical / calculated columns".

Regards

Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

Anonymous · ‎08-22-2017

Dear Tom,

Thank you very much for your time. It worked.

I had to go with the measure 2. The column took more than 15 minutes to update , and it didn´t! I had to kill the PBI process (and I do have a nice computer). The data is not that big, but I guess the number of iteractions are: in "TABLE_1" I have 186,822 rows, and "TABLE_2" 108,576 rows.

Best regards,

Eduardo

Anonymous · ‎08-22-2017

Actually when I killed the process, reopened, and try to add the column again, it worked after about a minute.

So, solution fully accepted!

Best regards,

Eduardo

TomMartens · ‎08-22-2017

Hey @Anonymous,

if you have some time, you may redownload the pbix file. There are two new calculated columns: column 3 and column 4.

If you are sure that there is just one daterange in table2 for any given date from table1 try column4, else go with column3.

Column3 is a little optimized in comparison to the initial column whereas Column 4 assumes just one valid daterange for any given employee / period.

I have to admit that I'm somewhat intereested if there a performance gain using the new columns

Happy to help

Regards

Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

Anonymous · ‎08-22-2017

Tom,

I tested all of them. Column 4 is perfect for me, and it was extremely fast.

In my data set:

Column took 32 seconds

Column 2 took more than 3 minutes and I killed it

Column 3 took more than 3 minutes and I killed it

Column 4 took 3 seconds

I´m in a Dell M3800, i7 with 16gb RAM.

TomMartens · ‎08-22-2017

Glad it works and thanks for sharing the details!

Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

LOOKUPVALUE in range of dates

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly