Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi all
Can Power BI's DAX language help determine how similar or different the names of thousands of customers are in a single column? The names are entered in various formats with different spacing or punctuation.
By calculating a similarity percentage, I want to identify which names are completely unique and which ones are similar to others.
This will allow me to manually investigate and correct the similar names.
I need to add , that this column exist in an Excel file , so maybe we could implement this out Power Query , if some one know another way , please share it her 🙂
Thanks in advance
Regards
Hi @Hercules78 ,
I created some data:
In DAX/Power Query, we can't check the result of each character in two columns in comparison, for example, ABC and ACB, if you don't consider this character order, you can try the following:
1. Add Column.
=Text.ToList([Group2])
Extends the value to the current table.
if Text.Contains([Group],[Name2 List]) then 1 else 0
Group BY:
Text.Length([Group])
[Match]/[Custom]
Set [Custom.1] -- Change Type -- Percentage
2. Result:
This is the related document, you can view this content:
Fuzzy Clustering in Power BI using Power Query: Finding similar values - RADACAD
Best Regards,
Liu Yang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly
Thanks a lot
I just want to add that I dont have 2 columns , I have only 1 column and I want to check similarity betweens rows !!
Thansk again
There are scripts to implement Levenshtein in SQL. Feel free to adapt these for Power Query.
You can also do a Pearson correlation of a column against itself.
There is no official SOUNDEX or Levenshtein etc. implementation in DAX. You can consider running Python or R scripts inside of Power BI for that.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
104 | |
101 | |
79 | |
72 | |
64 |
User | Count |
---|---|
142 | |
108 | |
101 | |
81 | |
74 |