Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
peterpm
Frequent Visitor

Locating unique values in two columns and isolating them, by a common ID?

Hello all, I'm trying to accomplish something that should be easy to in concept but unsure on how to proceed.

 

I have the following example table:

 

IDLA_Codes NY_Codes 
1a,ba,c,d
2b,gf

 

I'd need to split and isolate each code on its own row, and afterwards, compare both rows for each ID to find the unique codes, and separate them into 2 other columns.

 

Desired output:

 

IDLA_Codes NY_Codes Unique_LA Unique_NY 
1aabc
1bc d
1 d  
2bfbf
2gg  

 

Splitting the codes was done by splitting that column directly within Transform in Power Editor.

However, isolating the unique codes is proving to be my main problem.

 

I've tried calculated columns with:

=EXCEPT( SUMMARIZE( Table1, LA_Codes), SUMMARIZE(Table1, NY_Codes))

 

But that will return the values prior to the data transformation took place, not after, filling the new column with "a,b" and so on. SUMMARIZECOLUMNS will only work if there's only a single unique value, otherwise outputs an error. Neither of these methods take into account the ID, instead comparing the entire column regardless blindly.

 

Is there any way to accomplish this in Power BI using DAX?

Thank you for any help

1 ACCEPTED SOLUTION
barritown
Super User
Super User

Hi @peterpm,

I could propose a 2-step solution. 

Step 1. Let's add two similar calculation columns which contain the unique codes for each entity separated with a comma. Like that:

barritown_0-1692014997331.png

Step 2. Now let's split them by item and create an additional column for proper sorting.

barritown_1-1692015209996.png

 

Below you can find all the [DAX] code in plain text.

 

1. Unique_LA calculated column.

Unique_LA = 
VAR WordCountLA = LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ) + 1
VAR TempTableLA = GENERATE ( SELECTCOLUMNS ( { [LA_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountLA, 1 ) )
VAR SplitLA = ADDCOLUMNS ( TempTableLA, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR WordCountNY = LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) + 1
VAR TempTableNY = GENERATE ( SELECTCOLUMNS ( { [NY_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountNY, 1 ) )
VAR SplitNY = ADDCOLUMNS ( TempTableNY, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR ResultLA = EXCEPT ( SELECTCOLUMNS ( SplitLA, "Item", [Item] ), SELECTCOLUMNS ( SplitNY, "Item", [Item] ) )
RETURN CONCATENATEX ( ResultLA, [Item], "," )

2. Unique_NY calculated column.

Unique_NY = 
VAR WordCountLA = LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ) + 1
VAR TempTableLA = GENERATE ( SELECTCOLUMNS ( { [LA_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountLA, 1 ) )
VAR SplitLA = ADDCOLUMNS ( TempTableLA, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR WordCountNY = LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) + 1
VAR TempTableNY = GENERATE ( SELECTCOLUMNS ( { [NY_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountNY, 1 ) )
VAR SplitNY = ADDCOLUMNS ( TempTableNY, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR ResultNY = EXCEPT ( SELECTCOLUMNS ( SplitNY, "Item", [Item] ), SELECTCOLUMNS ( SplitLA, "Item", [Item] ) )
RETURN CONCATENATEX ( ResultNY, [Item], "," )

3. Result table.

Table = 
VAR WordCount =
    ADDCOLUMNS ( 
        Data,
        "@Word Count",
        MAX ( LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ), LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) ) + 1 )
VAR MaxWordCount = MAXX ( WordCount, [@Word Count] )
VAR TempTable = FILTER ( GENERATE ( WordCount, GENERATESERIES ( 1, MaxWordCount, 1 ) ), [Value] <= [@Word Count] )
VAR SplitTextByNumber = 
    ADDCOLUMNS ( 
        TempTable,
        "LA_Codes_", TRIM ( PATHITEM ( SUBSTITUTE ( [LA_Codes], ",", "|" ), [Value] ) ),
        "NY_Codes_", TRIM ( PATHITEM ( SUBSTITUTE ( [NY_Codes], ",", "|" ), [Value] ) ),
        "Unique_LA_", TRIM ( PATHITEM ( SUBSTITUTE ( [Unique_LA], ",", "|" ), [Value] ) ), 
        "Unique_NY_", TRIM ( PATHITEM ( SUBSTITUTE ( [Unique_NY], ",", "|" ), [Value] ) ),
        "SortOrder",  [ID]  & "-" & [Value] )
RETURN SplitTextByNumber

Half of the credits go to @AntrikshSharma for this solution - https://community.fabric.microsoft.com/t5/DAX-Commands-and-Tips/Calculated-column-with-comma-delimit...

Best Regards,

Alexander

My YouTube vlog in English

My YouTube vlog in Russian

View solution in original post

3 REPLIES 3
v-binbinyu-msft
Community Support
Community Support

Hi @peterpm ,

Could you tell me if your problem has been solved? If it is, kindly Accept it as the solution. More people will benefit from it. Or if you are still confused about it, please provide me with more details about your table and your problem or share me with your pbix file after removing sensitive data.

 

Best regards,
Community Support Team_Binbin Yu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

barritown
Super User
Super User

Hi @peterpm,

I could propose a 2-step solution. 

Step 1. Let's add two similar calculation columns which contain the unique codes for each entity separated with a comma. Like that:

barritown_0-1692014997331.png

Step 2. Now let's split them by item and create an additional column for proper sorting.

barritown_1-1692015209996.png

 

Below you can find all the [DAX] code in plain text.

 

1. Unique_LA calculated column.

Unique_LA = 
VAR WordCountLA = LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ) + 1
VAR TempTableLA = GENERATE ( SELECTCOLUMNS ( { [LA_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountLA, 1 ) )
VAR SplitLA = ADDCOLUMNS ( TempTableLA, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR WordCountNY = LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) + 1
VAR TempTableNY = GENERATE ( SELECTCOLUMNS ( { [NY_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountNY, 1 ) )
VAR SplitNY = ADDCOLUMNS ( TempTableNY, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR ResultLA = EXCEPT ( SELECTCOLUMNS ( SplitLA, "Item", [Item] ), SELECTCOLUMNS ( SplitNY, "Item", [Item] ) )
RETURN CONCATENATEX ( ResultLA, [Item], "," )

2. Unique_NY calculated column.

Unique_NY = 
VAR WordCountLA = LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ) + 1
VAR TempTableLA = GENERATE ( SELECTCOLUMNS ( { [LA_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountLA, 1 ) )
VAR SplitLA = ADDCOLUMNS ( TempTableLA, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR WordCountNY = LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) + 1
VAR TempTableNY = GENERATE ( SELECTCOLUMNS ( { [NY_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountNY, 1 ) )
VAR SplitNY = ADDCOLUMNS ( TempTableNY, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR ResultNY = EXCEPT ( SELECTCOLUMNS ( SplitNY, "Item", [Item] ), SELECTCOLUMNS ( SplitLA, "Item", [Item] ) )
RETURN CONCATENATEX ( ResultNY, [Item], "," )

3. Result table.

Table = 
VAR WordCount =
    ADDCOLUMNS ( 
        Data,
        "@Word Count",
        MAX ( LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ), LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) ) + 1 )
VAR MaxWordCount = MAXX ( WordCount, [@Word Count] )
VAR TempTable = FILTER ( GENERATE ( WordCount, GENERATESERIES ( 1, MaxWordCount, 1 ) ), [Value] <= [@Word Count] )
VAR SplitTextByNumber = 
    ADDCOLUMNS ( 
        TempTable,
        "LA_Codes_", TRIM ( PATHITEM ( SUBSTITUTE ( [LA_Codes], ",", "|" ), [Value] ) ),
        "NY_Codes_", TRIM ( PATHITEM ( SUBSTITUTE ( [NY_Codes], ",", "|" ), [Value] ) ),
        "Unique_LA_", TRIM ( PATHITEM ( SUBSTITUTE ( [Unique_LA], ",", "|" ), [Value] ) ), 
        "Unique_NY_", TRIM ( PATHITEM ( SUBSTITUTE ( [Unique_NY], ",", "|" ), [Value] ) ),
        "SortOrder",  [ID]  & "-" & [Value] )
RETURN SplitTextByNumber

Half of the credits go to @AntrikshSharma for this solution - https://community.fabric.microsoft.com/t5/DAX-Commands-and-Tips/Calculated-column-with-comma-delimit...

Best Regards,

Alexander

My YouTube vlog in English

My YouTube vlog in Russian

Hello @barritown !

Sorry but I missed your reply until recently, in the meantime we managed to sort out the issue in a different way but with your help I went back to the original report and applied this solution. Works like a charm despite being a lot of code I wasn't expecting, and allowed for handling the resulting output in an easier way too. We're now going forward with your help.

Thank you so much!

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.