Daven
Helper II

Check similarities between two columns in percentage

Hi,

 

I have two columns and would like to create a new column that shows how similar they are. The order of the words does not matter. If every word in Name1 is also contained in Name2, that would be a 100% match, e.g. below:

 

Name1           Name2       Similarity_% (New column)
Pepsi Co        Pepsi Co    100%
Co Pepsi        Pepsi Co    100%
Pepsi Co LTD    Pepsi Co    90%
Pepsi Co        Cola        0%

Thanks,

 

Daven

1 ACCEPTED SOLUTION
AlexisOlson
Super User

If you're doing this in the Power Query editor, you can add a custom column with this formula:

 

let
    a = List.Distinct(Text.Split([Name1], " ")),
    b = List.Distinct(Text.Split([Name2], " "))
in
    List.Count(List.Intersect({a, b})) / List.Count(a)

 

This gives the number of distinct words that the two names share, divided by the number of distinct words in Name1 (so 67% for your 3rd row, since 2 of the 3 words in "Pepsi Co LTD" appear in Name2; I don't know where your 90% came from).
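To see the formula applied to the whole sample, here is a minimal sketch of a standalone query you could paste into a blank query in the Advanced Editor. The inline #table and the step names Source and AddSimilarity are only stand-ins for your real data, and I'm assuming the last row reads "Pepsi Co" against "Cola". The column is typed as a percentage so it displays the way your example does.

```powerquery
let
    // Hypothetical stand-in for the real table: the four sample rows from the question
    Source = #table(
        type table [Name1 = text, Name2 = text],
        {
            {"Pepsi Co", "Pepsi Co"},
            {"Co Pepsi", "Pepsi Co"},
            {"Pepsi Co LTD", "Pepsi Co"},
            {"Pepsi Co", "Cola"}
        }
    ),
    // Same calculation as the custom column formula, written as a row function
    AddSimilarity = Table.AddColumn(
        Source,
        "Similarity_%",
        (row) =>
            let
                // Distinct words in each name, split on single spaces
                a = List.Distinct(Text.Split(row[Name1], " ")),
                b = List.Distinct(Text.Split(row[Name2], " "))
            in
                // Shared words divided by the number of words in Name1
                List.Count(List.Intersect({a, b})) / List.Count(a),
        Percentage.Type
    )
in
    AddSimilarity
```

With those four rows the new column should come out as 100%, 100%, about 67%, and 0%. Note that the comparison is case-sensitive and splits only on single spaces, so "pepsi" and "Pepsi" would not count as a match unless you normalize the text first.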

This is likely possible in DAX but more difficult without an analog for Text.Split.

