Solved: New Unique ID by concatenating two index columns

MJEnnis · ‎01-14-2022

I have a table that looks something like this (but the names are not present, just the IDs):

ID1	ID2	Person
53	1	Bob
5	31	Sally
888	62	Sheila
888	62	Sheila
999	50	Frank
999	77	Frank
14	99	Mike
102	88	Mike

ID1 and ID2 are both unique indexes that index the “persons” in other tables.

Some of the “persons” have multiple entries (like “Sheila”), others do not (like “Bob”). For other purposes, I need to create a new “distinct ID” to differentiate these multiple entries, as they relate to different events.

The obvious solution to me is a code like this:

Var TableNew = ADDCOLUMNS(TableOld,
"Distinct ID", CONCATENATE([ID1],[ID2])
)

However, on rare occasions, this code would give the same “distinct ID” to two different people, such as this:

ID1	ID2	Distinct ID	Person
53	1	531	Bob
5	31	531	Sally
888	62	88862	Sheila
888	62	88862	Sheila

Is there an easy way to specify a minimum length for ID1 and ID2, filling the missing slots with zeros, before concatenating? The ideal result would be something like the following:

ID1	ID2	Distinct ID	Person
53	1	53001	Bob
5	31	00531	Sally
888	62	88862	Sheila
888	62	88862	Sheila

Maybe there is a way to do this with RANKX()? But the resulting “distinct ID” needs to be reproducible in other tables that may not include all of the same “persons”. (Also I can not do this in Power Query at this stage…)

MJEnnis · ‎01-17-2022

This question was in spam limbo for a few days, and in the meantime, I figured it out. Will share my solution in case anyone else has the same problem.

CONCATENATE will work if at least one of the IDs is always of the same length. But it will not work if they both have variable lenths (as is my case).

RANKX will work to create a unique ID within one table. But if this unique ID has to be replicated in other already existing tables it will only work if all tables contain all combinations of the two IDs.

For this situation, I found this simple solution on the web:

Distinct ID = RANKX(TABLE, [longer ID] * 100000 + [shorter ID])

All you are doing in effect is adding five zeros to the end of the longer ID and then adding the shorter ID to the resulting number. The results are then ranked in descending order (or in ascending order, as needed) with no tie breakers, so that every instance of, for example, Sally with the IDs 5 and 31 gets the same unique identifer.

The problem is if one table does not have that combination, then the ranks get shifted accordingly and the IDs no longer match across tables.

The best solution I have found is to just drop the rank function and keep it to simple math.

Distinct ID = [longer ID] * 100000 + [shorter ID]

So you get something like this:

5300001 = Bob

500031 = Sally

This way the Distinct ID is always replicable, as long as ID1 and ID2 are unique indexes.

View solution in original post

MJEnnis · ‎01-17-2022

This question was in spam limbo for a few days, and in the meantime, I figured it out. Will share my solution in case anyone else has the same problem.

CONCATENATE will work if at least one of the IDs is always of the same length. But it will not work if they both have variable lenths (as is my case).

RANKX will work to create a unique ID within one table. But if this unique ID has to be replicated in other already existing tables it will only work if all tables contain all combinations of the two IDs.

For this situation, I found this simple solution on the web:

Distinct ID = RANKX(TABLE, [longer ID] * 100000 + [shorter ID])

All you are doing in effect is adding five zeros to the end of the longer ID and then adding the shorter ID to the resulting number. The results are then ranked in descending order (or in ascending order, as needed) with no tie breakers, so that every instance of, for example, Sally with the IDs 5 and 31 gets the same unique identifer.

The problem is if one table does not have that combination, then the ranks get shifted accordingly and the IDs no longer match across tables.

The best solution I have found is to just drop the rank function and keep it to simple math.

Distinct ID = [longer ID] * 100000 + [shorter ID]

So you get something like this:

5300001 = Bob

500031 = Sally

This way the Distinct ID is always replicable, as long as ID1 and ID2 are unique indexes.

MJEnnis · ‎01-17-2022

You just have to make sure that your multipler has enough zeros to contain the entire "shorter" ID. In my case, it is inconceivable that ID2 would ever be longer than 4 digits. To play it safe, I used 100000 as my multiplier.

New Unique ID by concatenating two index columns

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024