Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I have a table that looks something like this (but the names are not present, just the IDs):
ID1 | ID2 | Person |
53 | 1 | Bob |
5 | 31 | Sally |
888 | 62 | Sheila |
888 | 62 | Sheila |
999 | 50 | Frank |
999 | 77 | Frank |
14 | 99 | Mike |
102 | 88 | Mike |
ID1 and ID2 are both unique indexes that index the “persons” in other tables.
Some of the “persons” have multiple entries (like “Sheila”), others do not (like “Bob”). For other purposes, I need to create a new “distinct ID” to differentiate these multiple entries, as they relate to different events.
The obvious solution to me is a code like this:
Var TableNew = ADDCOLUMNS(TableOld,
"Distinct ID", CONCATENATE([ID1],[ID2])
)
However, on rare occasions, this code would give the same “distinct ID” to two different people, such as this:
ID1 | ID2 | Distinct ID | Person |
53 | 1 | 531 | Bob |
5 | 31 | 531 | Sally |
888 | 62 | 88862 | Sheila |
888 | 62 | 88862 | Sheila |
Is there an easy way to specify a minimum length for ID1 and ID2, filling the missing slots with zeros, before concatenating? The ideal result would be something like the following:
ID1 | ID2 | Distinct ID | Person |
53 | 1 | 53001 | Bob |
5 | 31 | 00531 | Sally |
888 | 62 | 88862 | Sheila |
888 | 62 | 88862 | Sheila |
Maybe there is a way to do this with RANKX()? But the resulting “distinct ID” needs to be reproducible in other tables that may not include all of the same “persons”. (Also I can not do this in Power Query at this stage…)
Solved! Go to Solution.
This question was in spam limbo for a few days, and in the meantime, I figured it out. Will share my solution in case anyone else has the same problem.
CONCATENATE will work if at least one of the IDs is always of the same length. But it will not work if they both have variable lenths (as is my case).
RANKX will work to create a unique ID within one table. But if this unique ID has to be replicated in other already existing tables it will only work if all tables contain all combinations of the two IDs.
For this situation, I found this simple solution on the web:
Distinct ID = RANKX(TABLE, [longer ID] * 100000 + [shorter ID])
All you are doing in effect is adding five zeros to the end of the longer ID and then adding the shorter ID to the resulting number. The results are then ranked in descending order (or in ascending order, as needed) with no tie breakers, so that every instance of, for example, Sally with the IDs 5 and 31 gets the same unique identifer.
The problem is if one table does not have that combination, then the ranks get shifted accordingly and the IDs no longer match across tables.
The best solution I have found is to just drop the rank function and keep it to simple math.
Distinct ID = [longer ID] * 100000 + [shorter ID]
So you get something like this:
5300001 = Bob
500031 = Sally
This way the Distinct ID is always replicable, as long as ID1 and ID2 are unique indexes.
This question was in spam limbo for a few days, and in the meantime, I figured it out. Will share my solution in case anyone else has the same problem.
CONCATENATE will work if at least one of the IDs is always of the same length. But it will not work if they both have variable lenths (as is my case).
RANKX will work to create a unique ID within one table. But if this unique ID has to be replicated in other already existing tables it will only work if all tables contain all combinations of the two IDs.
For this situation, I found this simple solution on the web:
Distinct ID = RANKX(TABLE, [longer ID] * 100000 + [shorter ID])
All you are doing in effect is adding five zeros to the end of the longer ID and then adding the shorter ID to the resulting number. The results are then ranked in descending order (or in ascending order, as needed) with no tie breakers, so that every instance of, for example, Sally with the IDs 5 and 31 gets the same unique identifer.
The problem is if one table does not have that combination, then the ranks get shifted accordingly and the IDs no longer match across tables.
The best solution I have found is to just drop the rank function and keep it to simple math.
Distinct ID = [longer ID] * 100000 + [shorter ID]
So you get something like this:
5300001 = Bob
500031 = Sally
This way the Distinct ID is always replicable, as long as ID1 and ID2 are unique indexes.
You just have to make sure that your multipler has enough zeros to contain the entire "shorter" ID. In my case, it is inconceivable that ID2 would ever be longer than 4 digits. To play it safe, I used 100000 as my multiplier.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
46 | |
26 | |
22 | |
12 | |
8 |
User | Count |
---|---|
74 | |
51 | |
45 | |
16 | |
12 |