Jqarroyo
Frequent Visitor

Fuzzy Matching - Scores & Size limits

Hello,


I'm testing the fuzzy matching merge option and have a couple of questions:

 

- Is there any way to get the similarity scores returned in the output? The Fuzzy Lookup add-in (Excel) offered this, which I find very convenient when you want to review the accuracy of the results above the similarity threshold.

- Have you tried this tool on big data sets? I'm quite impressed by how the algorithm works on the couple of small examples I've tested, but it seems to take ages to run on a bigger connection (approx. 1M rows). Is there any parameter I can change to make it run faster?

 

The logic behind this tool could have a very positive impact on one of the projects I'm working on, so any experience or feedback would be more than welcome!

 

Many thanks, 

 

Jq

2 REPLIES
v-eachen-msft
Community Support

Hi @Jqarroyo ,

 

To make it run faster, you can update to the latest version. The speed of fuzzy matching was optimized in the April release.

You may also refer to the document below on improving performance in Power Query:

https://docs.microsoft.com/en-us/power-bi/power-bi-reports-performance

 

Community Support Team _ Eads
If this post helps, please consider accepting it as the solution to help other members find it.

 

Hi @v-eachen-msft 


Thanks for your response.

I've got the August release of Power BI Desktop, so I believe it should contain the most recent version of Fuzzy Matching.


My connections/tables are quite clean, as I handled all the filtering, trimming, and cleaning in a different model, so I still can't get my head around the slow performance. Are 1M-2M rows too much for this tool? I'm running an i7 with 16 GB of RAM, so it's not a very slow machine.

The Excel add-in (Fuzzy Lookup) is slow and obviously has size limitations, but I would expect "Fuzzy Matching" in Power BI to perform faster and better on larger data sets.

I'm going to have a look at that post in more detail. Very interesting!

Thanks again for your help.

 

Jq
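
For anyone landing on this thread with the same two questions, here is a minimal Python sketch of the general ideas, not of Power Query's actual implementation. It assumes a Jaccard score over character bigrams (Microsoft's documentation describes fuzzy merge as Jaccard-based, but the bigram tokenization here is my own assumption) and a crude "blocking" key, the first character, so each left row is only compared against a small group of right rows instead of the whole table. All function names are hypothetical.

```python
from collections import defaultdict


def bigrams(text):
    """Return the set of character bigrams of a normalized string."""
    s = text.lower().strip()
    grams = {s[i:i + 2] for i in range(len(s) - 1)}
    return grams or {s}  # fall back to the whole string for 0-1 characters


def jaccard(a, b):
    """Jaccard similarity of two strings' bigram sets, in [0, 1]."""
    set_a, set_b = bigrams(a), bigrams(b)
    return len(set_a & set_b) / len(set_a | set_b)


def block_key(text):
    """Assumed blocking key: first character after normalization."""
    return text.lower().strip()[:1]


def fuzzy_match(left, right, threshold=0.8):
    """For each left value, return (left, best_right, score).

    best_right and score are None when nothing in the same block
    reaches the threshold.
    """
    # Group right-hand values by blocking key once, up front.
    blocks = defaultdict(list)
    for value in right:
        blocks[block_key(value)].append(value)

    results = []
    for value in left:
        candidates = blocks.get(block_key(value), [])
        scored = ((jaccard(value, c), c) for c in candidates)
        score, match = max(scored, key=lambda t: t[0], default=(0.0, None))
        if match is not None and score >= threshold:
            results.append((value, match, round(score, 3)))
        else:
            results.append((value, None, None))
    return results


if __name__ == "__main__":
    for row in fuzzy_match(["Microsft", "Zebra"], ["Microsoft", "Apple"], 0.5):
        print(row)
```

Returning the score next to each match is what makes it possible to review borderline results just above the threshold. The blocking step is where most of the speed-up comes from on large tables, at the cost of missing matches whose first character differs, so the key should be chosen to suit the data. On the Power Query side, the documentation for Table.FuzzyNestedJoin in newer releases lists a SimilarityColumnName option for returning the score; it is worth checking whether your version has it.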
