mbegg
Helper II

Identify "very-close" duplicates

I have a list of 19,000 business names with some "very-close" duplicates. 

 

E.g. XYZ Pty Ltd and XYZ Pty. Ltd. or ABCDE and ABCD

 

There is no pattern to the differences, so I can't just find and replace every "." in "Pty. Ltd." to fix all of the duplicates.

 

Is there a way to identify the "very-close" duplicates? I am thinking of a function that would identify whether the current value is the same as another value in the list except for 1, 2, 3, or x characters.
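
For what it's worth, the check described above (two values that match except for up to x characters) corresponds to what is usually called Levenshtein distance, or edit distance. Below is a minimal Python sketch that runs outside Power BI. The helper names `edit_distance` and `near_duplicates` and the `max_edits` parameter are hypothetical, chosen only for illustration. Comparing every pair is quadratic, so 19,000 names means roughly 180 million comparisons; the length check skips pairs that cannot possibly qualify.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def near_duplicates(names, max_edits=2):
    """Return (name_a, name_b, distance) for every pair within max_edits edits."""
    pairs = []
    for a, b in combinations(sorted(set(names)), 2):
        low_a, low_b = a.lower(), b.lower()  # compare case-insensitively
        if abs(len(low_a) - len(low_b)) > max_edits:
            continue  # the length gap alone already exceeds the allowed edits
        distance = edit_distance(low_a, low_b)
        if distance <= max_edits:
            pairs.append((a, b, distance))
    return pairs
```

Called on the examples from this post, `near_duplicates(["XYZ Pty Ltd", "XYZ Pty. Ltd.", "ABCDE", "ABCD"])` pairs "XYZ Pty Ltd" with "XYZ Pty. Ltd." (2 edits) and "ABCD" with "ABCDE" (1 edit).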

1 ACCEPTED SOLUTION
v-sihou-msft
Employee

@mbegg

 

Since there's no logic to the differences between those "very-close" duplicates, it's not possible to identify them via Power Query/DAX. I suggest you try a text analysis API to achieve your goal.

 

Regards,


3 REPLIES 3

This isn't a solution! While fuzzy matching in Power BI has been great, it doesn't handle fuzzy duplicates within a single column, so this post is not solved.

@v-sihou-msft

 

I now know that what I was trying to describe is called "fuzzy matching" in the data analytics space. I will add this as a development idea.
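
On the single-column point raised in this thread: fuzzy matching a list against itself can be sketched outside Power BI with Python's standard `difflib` module. This is only an illustration under stated assumptions, not a Power BI feature. The helper names `normalize` and `group_fuzzy_duplicates` and the 0.85 similarity threshold are hypothetical and would need tuning on real data. Each name is compared to the first member of every existing group, which is a simplification: a name joins the first group that is similar enough, not necessarily the best one.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and collapse repeated whitespace."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def group_fuzzy_duplicates(names, threshold=0.85):
    """Group names whose normalized forms are at least `threshold` similar."""
    groups = []  # each entry: (normalized representative, [original names])
    for name in names:
        key = normalize(name)
        for representative, members in groups:
            if SequenceMatcher(None, representative, key).ratio() >= threshold:
                members.append(name)
                break
        else:
            # no existing group was similar enough, so start a new one
            groups.append((key, [name]))
    # only groups with more than one member are candidate duplicates
    return [members for _, members in groups if len(members) > 1]
```

Normalizing first means "XYZ Pty Ltd" and "XYZ Pty. Ltd." become identical before any similarity scoring, so the threshold only has to cover genuine spelling differences such as "ABCDE" versus "ABCD". The resulting groups are candidates for manual review, not confirmed duplicates.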
