Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and a 50 percent discount on exams.
Get startedEarn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hello all, I'm trying to accomplish something that should be easy to in concept but unsure on how to proceed.
I have the following example table:
ID | LA_Codes | NY_Codes |
1 | a,b | a,c,d |
2 | b,g | f |
I'd need to split and isolate each code on its own row, and afterwards, compare both rows for each ID to find the unique codes, and separate them into 2 other columns.
Desired output:
ID | LA_Codes | NY_Codes | Unique_LA | Unique_NY |
1 | a | a | b | c |
1 | b | c | d | |
1 | d | |||
2 | b | f | b | f |
2 | g | g |
Splitting the codes was done by splitting that column directly within Transform in Power Editor.
However, isolating the unique codes is proving to be my main problem.
I've tried calculated columns with:
=EXCEPT( SUMMARIZE( Table1, LA_Codes), SUMMARIZE(Table1, NY_Codes))
But that will return the values prior to the data transformation took place, not after, filling the new column with "a,b" and so on. SUMMARIZECOLUMNS will only work if there's only a single unique value, otherwise outputs an error. Neither of these methods take into account the ID, instead comparing the entire column regardless blindly.
Is there any way to accomplish this in Power BI using DAX?
Thank you for any help
Solved! Go to Solution.
Hi @peterpm,
I could propose a 2-step solution.
Step 1. Let's add two similar calculation columns which contain the unique codes for each entity separated with a comma. Like that:
Step 2. Now let's split them by item and create an additional column for proper sorting.
Below you can find all the [DAX] code in plain text.
1. Unique_LA calculated column.
Unique_LA =
VAR WordCountLA = LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ) + 1
VAR TempTableLA = GENERATE ( SELECTCOLUMNS ( { [LA_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountLA, 1 ) )
VAR SplitLA = ADDCOLUMNS ( TempTableLA, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR WordCountNY = LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) + 1
VAR TempTableNY = GENERATE ( SELECTCOLUMNS ( { [NY_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountNY, 1 ) )
VAR SplitNY = ADDCOLUMNS ( TempTableNY, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR ResultLA = EXCEPT ( SELECTCOLUMNS ( SplitLA, "Item", [Item] ), SELECTCOLUMNS ( SplitNY, "Item", [Item] ) )
RETURN CONCATENATEX ( ResultLA, [Item], "," )
2. Unique_NY calculated column.
Unique_NY =
VAR WordCountLA = LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ) + 1
VAR TempTableLA = GENERATE ( SELECTCOLUMNS ( { [LA_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountLA, 1 ) )
VAR SplitLA = ADDCOLUMNS ( TempTableLA, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR WordCountNY = LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) + 1
VAR TempTableNY = GENERATE ( SELECTCOLUMNS ( { [NY_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountNY, 1 ) )
VAR SplitNY = ADDCOLUMNS ( TempTableNY, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR ResultNY = EXCEPT ( SELECTCOLUMNS ( SplitNY, "Item", [Item] ), SELECTCOLUMNS ( SplitLA, "Item", [Item] ) )
RETURN CONCATENATEX ( ResultNY, [Item], "," )
3. Result table.
Table =
VAR WordCount =
ADDCOLUMNS (
Data,
"@Word Count",
MAX ( LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ), LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) ) + 1 )
VAR MaxWordCount = MAXX ( WordCount, [@Word Count] )
VAR TempTable = FILTER ( GENERATE ( WordCount, GENERATESERIES ( 1, MaxWordCount, 1 ) ), [Value] <= [@Word Count] )
VAR SplitTextByNumber =
ADDCOLUMNS (
TempTable,
"LA_Codes_", TRIM ( PATHITEM ( SUBSTITUTE ( [LA_Codes], ",", "|" ), [Value] ) ),
"NY_Codes_", TRIM ( PATHITEM ( SUBSTITUTE ( [NY_Codes], ",", "|" ), [Value] ) ),
"Unique_LA_", TRIM ( PATHITEM ( SUBSTITUTE ( [Unique_LA], ",", "|" ), [Value] ) ),
"Unique_NY_", TRIM ( PATHITEM ( SUBSTITUTE ( [Unique_NY], ",", "|" ), [Value] ) ),
"SortOrder", [ID] & "-" & [Value] )
RETURN SplitTextByNumber
Half of the credits go to @AntrikshSharma for this solution - https://community.fabric.microsoft.com/t5/DAX-Commands-and-Tips/Calculated-column-with-comma-delimit...
Best Regards,
Alexander
Hi @peterpm ,
Could you tell me if your problem has been solved? If it is, kindly Accept it as the solution. More people will benefit from it. Or if you are still confused about it, please provide me with more details about your table and your problem or share me with your pbix file after removing sensitive data.
Best regards,
Community Support Team_Binbin Yu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @peterpm,
I could propose a 2-step solution.
Step 1. Let's add two similar calculation columns which contain the unique codes for each entity separated with a comma. Like that:
Step 2. Now let's split them by item and create an additional column for proper sorting.
Below you can find all the [DAX] code in plain text.
1. Unique_LA calculated column.
Unique_LA =
VAR WordCountLA = LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ) + 1
VAR TempTableLA = GENERATE ( SELECTCOLUMNS ( { [LA_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountLA, 1 ) )
VAR SplitLA = ADDCOLUMNS ( TempTableLA, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR WordCountNY = LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) + 1
VAR TempTableNY = GENERATE ( SELECTCOLUMNS ( { [NY_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountNY, 1 ) )
VAR SplitNY = ADDCOLUMNS ( TempTableNY, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR ResultLA = EXCEPT ( SELECTCOLUMNS ( SplitLA, "Item", [Item] ), SELECTCOLUMNS ( SplitNY, "Item", [Item] ) )
RETURN CONCATENATEX ( ResultLA, [Item], "," )
2. Unique_NY calculated column.
Unique_NY =
VAR WordCountLA = LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ) + 1
VAR TempTableLA = GENERATE ( SELECTCOLUMNS ( { [LA_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountLA, 1 ) )
VAR SplitLA = ADDCOLUMNS ( TempTableLA, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR WordCountNY = LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) + 1
VAR TempTableNY = GENERATE ( SELECTCOLUMNS ( { [NY_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountNY, 1 ) )
VAR SplitNY = ADDCOLUMNS ( TempTableNY, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR ResultNY = EXCEPT ( SELECTCOLUMNS ( SplitNY, "Item", [Item] ), SELECTCOLUMNS ( SplitLA, "Item", [Item] ) )
RETURN CONCATENATEX ( ResultNY, [Item], "," )
3. Result table.
Table =
VAR WordCount =
ADDCOLUMNS (
Data,
"@Word Count",
MAX ( LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ), LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) ) + 1 )
VAR MaxWordCount = MAXX ( WordCount, [@Word Count] )
VAR TempTable = FILTER ( GENERATE ( WordCount, GENERATESERIES ( 1, MaxWordCount, 1 ) ), [Value] <= [@Word Count] )
VAR SplitTextByNumber =
ADDCOLUMNS (
TempTable,
"LA_Codes_", TRIM ( PATHITEM ( SUBSTITUTE ( [LA_Codes], ",", "|" ), [Value] ) ),
"NY_Codes_", TRIM ( PATHITEM ( SUBSTITUTE ( [NY_Codes], ",", "|" ), [Value] ) ),
"Unique_LA_", TRIM ( PATHITEM ( SUBSTITUTE ( [Unique_LA], ",", "|" ), [Value] ) ),
"Unique_NY_", TRIM ( PATHITEM ( SUBSTITUTE ( [Unique_NY], ",", "|" ), [Value] ) ),
"SortOrder", [ID] & "-" & [Value] )
RETURN SplitTextByNumber
Half of the credits go to @AntrikshSharma for this solution - https://community.fabric.microsoft.com/t5/DAX-Commands-and-Tips/Calculated-column-with-comma-delimit...
Best Regards,
Alexander
Hello @barritown !
Sorry but I missed your reply until recently, in the meantime we managed to sort out the issue in a different way but with your help I went back to the original report and applied this solution. Works like a charm despite being a lot of code I wasn't expecting, and allowed for handling the resulting output in an easier way too. We're now going forward with your help.
Thank you so much!