Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi All.
I have a need to join two client system extracts where the keys don’t exactly match.
Each table has PolicyNo, FirstName, LastName fields, data patterns are like
999991,Joe,Bloggs --matches-- AAA999991xx,Joe,Bloggs
999991,Joe,Bloggs --matches-- AAA999992xx,Mary,Bloggs
88188,An,Other --matches-- BBB88188,Anne,Other
88188,John,Smith --matches-- CCC88188,John,Smithe
I am getting good results using Table.FuzzyNestedJoin joining on all three fields with a Threshold setting of .7, when using a filtered test dataset (1000 records). 999991,Joe,Bloggs produces the necessary two matches AAA999991xx,Joe,Bloggs and AAA999992xx,Mary,Bloggs.
But when I scale up across the entire records (6K records table1, 1.5M table2) using the same code, {999991,Joe,Bloggs} only returns the AAA999991xx,Joe,Bloggs match.
Reducing the threshold to .5 produces many false matches e.g. {999991,Joe,Bloggs} returns additional hits such as BBB999991xx,Will,Robinson and 999991,Zachary,Smith but the important AAA999992xx,Mary,Bloggs match is missing.
Reducing the threshold further to say .3, {999991,Joe,Bloggs} produces a large number of false matches, but does produce the important AAA999992xx,Mary,Bloggs match.
So questions are:
Thanks for your help
Hi @KiwiPete ,
Check the reference below:
https://www.poweredsolutions.co/2019/03/26/fuzzy-matching-in-power-bi-power-query/
Best Regards,
Kelly
Did I answer your question? Mark my post as a solution!
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.