cparrot
Helper I

Fuzzy Matching list size limit?

1. Two different lists. The keywords list has fewer than 67 rows.
2. PBI matches words at a 0.5 threshold.
3. Added 4 new words to the list.
4. Fails to match ANY words after refresh.

Hello, I am fuzzy matching in PBI: I pull a list of keywords from Excel, then compare that list to another Excel list of long strings of text. It works fine when the list of keywords is small, but when I add more than ~67 keywords, the matching fails to capture ANY matches. As soon as I cut the list back to fewer than 67 keywords, it works fine again.

 

Is anyone else getting a similar issue? I've even tried adding the keywords directly into a PBI internal table, and it still fails when I add more than 67 rows.

13 REPLIES 13
Ashish_Mathur
Super User

Hi,

In each row of the Test string column, can multiple keywords be found or will only one keyword be found in each row?


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

Hello, multiple keywords can be found per row.

Hi,

Please share data in a format that can be pasted in an MS Excel file.


Regards,
Ashish Mathur

cat
horse
dog
string
yarn
fence
goat
baby
town
home
beaver
cow
trailer
wheels
rims
stock
buy
tune
song
yellow
green
blue
red
black
brown
tan
gray
orange
white
blue
tulip
boat
flower
rainbow
car
ramp
bridge
plane
jet
hoverboard
clock
bear
beard
prince
king
skunk
ship
water
sea
sky
pond
fish
lake
ocean
continent
path
road
brighter
taller
shorter
similar
noticibly
fast
slow
stopped
pants
shirt
garage
pavement
chowder
pb&j

 

 

 

 

test string
With this utility you generate a 16 character output based on your input of numbers and upper and lower case letters.  Random strings can be unique. Used in computing, a random string generator can also be called a random character string generator. This is an important tool if you want to generate a unique set of strings. The utility generates a sequence that lacks a pattern and is random.
Based on your input, get a horse random alpha numeric string. The random string generator creates a series of numbers and letters that have no pattern. These can be helpful for creating security codes.
Possible applications for a random string generator could be for statistical sampling,
 simulations, and cryptography.  For security cat reasons, a random string generator can be useful. 
The stench from the feedlot permeated the yellow car despite having the air conditioning on recycled air.
Truth in advertising and dinosaurs with skateboards have much in common.
The beach was crowded with snow leopards and a car.
Improve your goldfish's physical fitness by getting him a bicycle or a plane.
I often see the time 11:11 or 12:34 on clocks.
They were excited to see their first sloth.
Thirty years later, she still thought it was okay to put the toilet paper roll under rather than over.
Check back tomorrow; I will see if the book has arrived.Dog
They called out her name boat time and again, but were met with nothing but silence.
Mary realized if her calculator had a history, it would be more embarrassing than her computer browser history.

Hi,

You may download my PBI file from here.

Hope this helps.

Untitled.png


Regards,
Ashish Mathur

Hello, and thank you for your help with the fuzzy matching alternative. However, it appears that a couple of false positives were flagged, and I'm wondering how to make it more accurate.

 

cparrot_0-1663074385822.png

 

 

Hi,

I do not understand your doubt. In the screenshot which I shared with you, Cat and tan appear in the strings very clearly.


Regards,
Ashish Mathur

"cat" is flagged in two strings, but is actually present in only one of them.

"tan" is flagged, but does not appear as a word in the string where it is flagged.

matching.png

 

Cat appears in the word applications.


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/
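
The false positives above come from matching a keyword anywhere inside the text, so "cat" is found inside "applications" (in the same way, "tan" occurs inside "important"). Below is a minimal sketch of the difference between substring matching and whole-word matching. It is written in Python rather than Power Query, it is not taken from the shared PBI file, and the function names `substring_hits` and `whole_word_hits` are hypothetical.

```python
import re

def substring_hits(keywords, text):
    """Keywords found anywhere in the text, even inside longer words."""
    lowered = text.lower()
    return [k for k in keywords if k.lower() in lowered]

def whole_word_hits(keywords, text):
    """Keywords that appear as standalone words (case-insensitive)."""
    hits = []
    for k in keywords:
        # Require a non-word character (or the string edge) on both sides.
        # re.escape keeps keywords such as "pb&j" literal.
        pattern = r"(?<!\w)" + re.escape(k) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(k)
    return hits

keywords = ["cat", "tan", "car"]
row_a = "Possible applications for a random string generator could be for statistical sampling,"
row_b = " simulations, and cryptography.  For security cat reasons, a random string generator can be useful. "
row_c = "This is an important tool if you want to generate a unique set of strings."

for row in (row_a, row_b, row_c):
    print(substring_hits(keywords, row), whole_word_hits(keywords, row))
```

With whole-word matching, "cat" is reported only for the row that contains it as a separate word. Because each row returns a list, several keywords per row are still supported.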

Hi Ashish, thank you for your help with this Power BI file. However, I am running into a 3000-row limit when copying and pasting into table1. I need to be able to pull from an external Excel spreadsheet with 100,000+ rows of data, but I get an error when using the steps you provided. What needs to be altered in the code so that both keywords and text can be pulled from Excel spreadsheets?

cparrot_2-1696515005535.png

I changed the code to look for [Name] rather than [Text], but the result is empty.

cparrot_3-1696515103621.png

 

Hi,

Do not paste the data into a Table.  Instead, import data from the Excel file.


Regards,
Ashish Mathur
parry2k
Super User

@cparrot how are you doing the fuzzy matching?




The string text list is left-joined to the keywords list (to show all the strings, but only the keywords which are successfully found).

0.50 threshold

ignoring case

not combining text parts
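
The settings listed above can be sketched roughly as follows. This is only an illustration in Python: it uses `difflib.SequenceMatcher` as a stand-in similarity score, which is an assumption and not the algorithm Power BI's fuzzy merge applies, and the function name `fuzzy_left_join` is hypothetical. It keeps every string (left join), ignores case, and compares each keyword against individual words in the string.

```python
from difflib import SequenceMatcher
import re

def fuzzy_left_join(strings, keywords, threshold=0.5):
    """Return (string, keyword) rows; strings with no match get (string, None)."""
    rows = []
    for s in strings:
        # Lower-case both sides so the comparison ignores case.
        words = re.findall(r"\w+", s.lower())
        matched = [
            k for k in keywords
            if any(
                SequenceMatcher(None, k.lower(), w).ratio() >= threshold
                for w in words
            )
        ]
        if matched:
            rows.extend((s, k) for k in matched)
        else:
            rows.append((s, None))  # left join: keep the unmatched string
    return rows

strings = [
    "The beach was crowded with snow leopards and a car.",
    "They were excited to see their first sloth.",
]
keywords = ["car", "hoverboard"]

for row in fuzzy_left_join(strings, keywords, threshold=0.5):
    print(row)
```

With this stand-in score, a 0.5 threshold is quite loose for short keywords: "car" against "cat" scores about 0.67, so it would count as a match. Raising the threshold toward 1.0 narrows the result to near-exact words.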
