Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
KiwiPete
Frequent Visitor

Table.FuzzyNestedJoin threshold producing inconsistent reslts

Hi All.

I have a need to join two client system extracts where the keys don’t exactly match.

Each table has PolicyNo, FirstName, LastName fields, data patterns are like

 

999991,Joe,Bloggs  --matches--  AAA999991xx,Joe,Bloggs
999991,Joe,Bloggs  --matches--  AAA999992xx,Mary,Bloggs
88188,An,Other  --matches--  BBB88188,Anne,Other
88188,John,Smith  --matches--  CCC88188,John,Smithe

 

I am getting good results using Table.FuzzyNestedJoin joining on all three fields with a Threshold setting of .7, when using a filtered test dataset (1000 records). 999991,Joe,Bloggs produces the necessary two matches AAA999991xx,Joe,Bloggs and AAA999992xx,Mary,Bloggs.

 

But when I scale up across the entire records (6K records table1, 1.5M table2) using the same code,  {999991,Joe,Bloggs} only returns the AAA999991xx,Joe,Bloggs match.

 

Reducing the threshold to .5 produces many false matches e.g. {999991,Joe,Bloggs} returns additional hits such as BBB999991xx,Will,Robinson and 999991,Zachary,Smith but the important AAA999992xx,Mary,Bloggs match is missing.

Reducing the threshold further to say .3, {999991,Joe,Bloggs} produces a large number of false matches, but does produce the important AAA999992xx,Mary,Bloggs match.

 

So questions are:

  1.   Is this to be expected? Does the threshold somehow work like an average across the entire record set so each record is compared against a balance of the other records?
  2. Is there a way to return the threshold score for each match – this would help determine the right threshold setting?
  3. Is there another approach that maybe I should be taking?

Thanks for your help

1 REPLY 1
v-kelly-msft
Community Support
Community Support

Hi @KiwiPete ,

 

Check the reference below:

https://www.poweredsolutions.co/2019/03/26/fuzzy-matching-in-power-bi-power-query/

 

Best Regards,
Kelly

Did I answer your question? Mark my post as a solution!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors