Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
PhilC
Resolver I
Resolver I

Keyword search to return exact word match

I would like to extract the keywords that are matched between two columns of text in a dataset, but to have an exact match on the full word only.

 

See the linked Excel file, the output desired for UID 2 is only "software", not "war, software", and UID 3 would have no match for "drive" in the Abstract as the word there is "driveway"

Keyword Example in Excel

 

The example includes keywords “war” and “software”.  War should not be considered a match for software.

 

Using a List approach I can return a TRUE / FALSE for the exact match, but I also want to show the user the keywords that were matched.

#"Added List of Title and Abstract Edited" = Table.AddColumn(#"Inserted Merged Column", "Title and Abstract Edited as List", each Text.Split([Title and Abstract Edited]," ")),
    #"Find Keywords in Title and Abstract List" = Table.AddColumn(#"Added List of Title and Abstract Edited", "Title and Abstract Edited Keyword Match", each List.ContainsAny([Title and Abstract Edited as List], KeywordsList)),

 

I have tried two other approaches to obtain the matches but they return the false positives.

#"Added Keyword Matches as list" = Table.AddColumn(#"Filtered Rows1", "Keyword Match List", (x) => List.Select(KeywordsList, each Text.PositionOf(x[#"Title and Abstract Edited"], _) >= 0)),


#"Add Abstract Keyword Table" = Table.AddColumn(#"Extracted Values", "Keywords Found", each let CurrentText = [Abstract] in Table.SelectRows(#"Keywords In Use",each Text.Contains(CurrentText, [Keywords], Comparer.OrdinalIgnoreCase))),

Is there a way to return only exact word matches, like the List approach that returns TRUE / FALSE?

 

Also, this query takes a considerable time to run, with the processing indicating 200MB for the keywords, which are a list of 300 words, so seems like it is looping through that table many times.  Is there a better approach to eliminate this?

 

Appreciate any assistance.

 

 

1 ACCEPTED SOLUTION
ImkeF
Super User
Super User

Yes, List.Instersect would do that job here:

 

#"Added List of Title and Abstract Edited" = Table.AddColumn(#"Inserted Merged Column", "Title and Abstract Edited as List", each Text.Split([Title and Abstract Edited]," ")),
    #"Find Keywords in Title and Abstract List" = Table.AddColumn(#"Added List of Title and Abstract Edited", "Title and Abstract Edited Keyword Match", each List.Intersect( { [Title and Abstract Edited as List], KeywordsList } ))

Then you can either expand or use Text.Combine to create a string with all keywords.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

2 REPLIES 2
ImkeF
Super User
Super User

Yes, List.Instersect would do that job here:

 

#"Added List of Title and Abstract Edited" = Table.AddColumn(#"Inserted Merged Column", "Title and Abstract Edited as List", each Text.Split([Title and Abstract Edited]," ")),
    #"Find Keywords in Title and Abstract List" = Table.AddColumn(#"Added List of Title and Abstract Edited", "Title and Abstract Edited Keyword Match", each List.Intersect( { [Title and Abstract Edited as List], KeywordsList } ))

Then you can either expand or use Text.Combine to create a string with all keywords.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Works perfectly, thank you for the quick response.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors