Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
MJEnnis
Helper V
Helper V

New Unique ID by concatenating two index columns

I have a table that looks something like this (but the names are not present, just the IDs):

 

ID1ID2Person
531Bob
531Sally
88862Sheila
88862Sheila
99950Frank
99977Frank
1499Mike
10288Mike

 

ID1 and ID2 are both unique indexes that index the “persons” in other tables.

 

Some of the “persons” have multiple entries (like “Sheila”), others do not (like “Bob”). For other purposes, I need to create a new “distinct ID” to differentiate these multiple entries, as they relate to different events.

 

The obvious solution to me is a code like this:

 

Var TableNew = ADDCOLUMNS(TableOld,
"Distinct ID", CONCATENATE([ID1],[ID2])
)

 

However, on rare occasions, this code would give the same “distinct ID” to two different people, such as this:

 

ID1ID2Distinct IDPerson
531531Bob
531531Sally
8886288862Sheila
8886288862Sheila

 

 

Is there an easy way to specify a minimum length for ID1 and ID2, filling the missing slots with zeros, before concatenating? The ideal result would be something like the following:

 

ID1ID2Distinct IDPerson
53153001Bob
53100531Sally
8886288862Sheila
8886288862Sheila

 

Maybe there is a way to do this with RANKX()? But the resulting “distinct ID” needs to be reproducible in other tables that may not include all of the same “persons”. (Also I can not do this in Power Query at this stage…)

1 ACCEPTED SOLUTION
MJEnnis
Helper V
Helper V

This question was in spam limbo for a few days, and in the meantime, I figured it out. Will share my solution in case anyone else has the same problem.

 

CONCATENATE will work if at least one of the IDs is always of the same length. But it will not work if they both have variable lenths (as is my case).

 

RANKX will work to create a unique ID within one table. But if this unique ID has to be replicated in other already existing tables it will only work if all tables contain all combinations of the two IDs.

For this situation, I found this simple solution on the web: 

Distinct ID = RANKX(TABLE, [longer ID] * 100000 + [shorter ID]) 

 

All you are doing in effect is adding five zeros to the end of the longer ID and then adding the shorter ID to the resulting number. The results are then ranked in descending order (or in ascending order, as needed) with no tie breakers, so that every instance of, for example, Sally with the IDs 5 and 31 gets the same unique identifer.

 

The problem is if one table does not have that combination, then the ranks get shifted accordingly and the IDs no longer match across tables.

 

The best solution I have found is to just drop the rank function and keep it to simple math.

 

Distinct ID = [longer ID] * 100000 + [shorter ID]

 

So you get something like this:

 

5300001 = Bob

500031 = Sally

 

This way the Distinct ID is always replicable, as long as ID1 and ID2 are unique indexes.

 

View solution in original post

2 REPLIES 2
MJEnnis
Helper V
Helper V

This question was in spam limbo for a few days, and in the meantime, I figured it out. Will share my solution in case anyone else has the same problem.

 

CONCATENATE will work if at least one of the IDs is always of the same length. But it will not work if they both have variable lenths (as is my case).

 

RANKX will work to create a unique ID within one table. But if this unique ID has to be replicated in other already existing tables it will only work if all tables contain all combinations of the two IDs.

For this situation, I found this simple solution on the web: 

Distinct ID = RANKX(TABLE, [longer ID] * 100000 + [shorter ID]) 

 

All you are doing in effect is adding five zeros to the end of the longer ID and then adding the shorter ID to the resulting number. The results are then ranked in descending order (or in ascending order, as needed) with no tie breakers, so that every instance of, for example, Sally with the IDs 5 and 31 gets the same unique identifer.

 

The problem is if one table does not have that combination, then the ranks get shifted accordingly and the IDs no longer match across tables.

 

The best solution I have found is to just drop the rank function and keep it to simple math.

 

Distinct ID = [longer ID] * 100000 + [shorter ID]

 

So you get something like this:

 

5300001 = Bob

500031 = Sally

 

This way the Distinct ID is always replicable, as long as ID1 and ID2 are unique indexes.

 

You just have to make sure that your multipler has enough zeros to contain the entire "shorter" ID. In my case, it is inconceivable that ID2 would ever be longer than 4 digits. To play it safe, I used 100000 as my multiplier. 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.