Solved: Re: Locating unique values in two columns and isol...

peterpm · ‎08-08-2023

Hello all, I'm trying to accomplish something that should be easy to in concept but unsure on how to proceed.

I have the following example table:

ID	LA_Codes	NY_Codes
1	a,b	a,c,d
2	b,g	f

I'd need to split and isolate each code on its own row, and afterwards, compare both rows for each ID to find the unique codes, and separate them into 2 other columns.

Desired output:

ID	LA_Codes	NY_Codes	Unique_LA	Unique_NY
1	a	a	b	c
1	b	c		d
1		d
2	b	f	b	f
2	g	g

Splitting the codes was done by splitting that column directly within Transform in Power Editor.

However, isolating the unique codes is proving to be my main problem.

I've tried calculated columns with:

=EXCEPT( SUMMARIZE( Table1, LA_Codes), SUMMARIZE(Table1, NY_Codes))

But that will return the values prior to the data transformation took place, not after, filling the new column with "a,b" and so on. SUMMARIZECOLUMNS will only work if there's only a single unique value, otherwise outputs an error. Neither of these methods take into account the ID, instead comparing the entire column regardless blindly.

Is there any way to accomplish this in Power BI using DAX?

Thank you for any help

barritown · ‎08-14-2023

Hi @peterpm,

I could propose a 2-step solution.

Step 1. Let's add two similar calculation columns which contain the unique codes for each entity separated with a comma. Like that:

Step 2. Now let's split them by item and create an additional column for proper sorting.

Below you can find all the [DAX] code in plain text.

1. Unique_LA calculated column.

Unique_LA = 
VAR WordCountLA = LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ) + 1
VAR TempTableLA = GENERATE ( SELECTCOLUMNS ( { [LA_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountLA, 1 ) )
VAR SplitLA = ADDCOLUMNS ( TempTableLA, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR WordCountNY = LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) + 1
VAR TempTableNY = GENERATE ( SELECTCOLUMNS ( { [NY_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountNY, 1 ) )
VAR SplitNY = ADDCOLUMNS ( TempTableNY, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR ResultLA = EXCEPT ( SELECTCOLUMNS ( SplitLA, "Item", [Item] ), SELECTCOLUMNS ( SplitNY, "Item", [Item] ) )
RETURN CONCATENATEX ( ResultLA, [Item], "," )

2. Unique_NY calculated column.

Unique_NY = 
VAR WordCountLA = LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ) + 1
VAR TempTableLA = GENERATE ( SELECTCOLUMNS ( { [LA_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountLA, 1 ) )
VAR SplitLA = ADDCOLUMNS ( TempTableLA, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR WordCountNY = LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) + 1
VAR TempTableNY = GENERATE ( SELECTCOLUMNS ( { [NY_Codes] }, "Code", [Value] ), GENERATESERIES ( 1, WordCountNY, 1 ) )
VAR SplitNY = ADDCOLUMNS ( TempTableNY, "Item", TRIM ( PATHITEM ( SUBSTITUTE ( [Code], ",", "|" ), [Value] ) ) )
VAR ResultNY = EXCEPT ( SELECTCOLUMNS ( SplitNY, "Item", [Item] ), SELECTCOLUMNS ( SplitLA, "Item", [Item] ) )
RETURN CONCATENATEX ( ResultNY, [Item], "," )

3. Result table.

Table = 
VAR WordCount =
    ADDCOLUMNS ( 
        Data,
        "@Word Count",
        MAX ( LEN ( [LA_Codes] ) - LEN ( SUBSTITUTE ( [LA_Codes], ",", "" ) ), LEN ( [NY_Codes] ) - LEN ( SUBSTITUTE ( [NY_Codes], ",", "" ) ) ) + 1 )
VAR MaxWordCount = MAXX ( WordCount, [@Word Count] )
VAR TempTable = FILTER ( GENERATE ( WordCount, GENERATESERIES ( 1, MaxWordCount, 1 ) ), [Value] <= [@Word Count] )
VAR SplitTextByNumber = 
    ADDCOLUMNS ( 
        TempTable,
        "LA_Codes_", TRIM ( PATHITEM ( SUBSTITUTE ( [LA_Codes], ",", "|" ), [Value] ) ),
        "NY_Codes_", TRIM ( PATHITEM ( SUBSTITUTE ( [NY_Codes], ",", "|" ), [Value] ) ),
        "Unique_LA_", TRIM ( PATHITEM ( SUBSTITUTE ( [Unique_LA], ",", "|" ), [Value] ) ), 
        "Unique_NY_", TRIM ( PATHITEM ( SUBSTITUTE ( [Unique_NY], ",", "|" ), [Value] ) ),
        "SortOrder",  [ID]  & "-" & [Value] )
RETURN SplitTextByNumber

Half of the credits go to @AntrikshSharma for this solution - https://community.fabric.microsoft.com/t5/DAX-Commands-and-Tips/Calculated-column-with-comma-delimit...

Best Regards,

Alexander

My YouTube vlog in English

My YouTube vlog in Russian

View solution in original post

v-binbinyu-msft · ‎09-03-2023

Hi @peterpm ,

Could you tell me if your problem has been solved? If it is, kindly Accept it as the solution. More people will benefit from it. Or if you are still confused about it, please provide me with more details about your table and your problem or share me with your pbix file after removing sensitive data.

Best regards,
Community Support Team_Binbin Yu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

barritown · ‎08-14-2023