Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
adamsharma
Frequent Visitor

Find String Containing All Words in Another String

Hi there,

 

Driving myself to distraction trying to figure this out so would appreciate some help.

 

My situation is I have two tables which contain columns with strings. I want to take the values from the column on the first table and perform a lookup to see if there is a row in the column on the second table which contains ALL of the individual words in each string, in any position (but ideally only if they appear in the same order as in the original string). The match doesn't need to be case-sensitive.

 

For example:

 

Table 1:

 

idstringMatch in Table 2?
1Test stringY
2

Another test string

Y

3

Yet another test string

N

4

This string

Y

5

String this

N

 

Table 2:

 

idstring
AThis is a test string
BThis is another test string

 

I have searched far and wide trying to find a solution but hitting a brick wall. Would really appreciate some help!

 

Thanks,

 

Adam

1 ACCEPTED SOLUTION

I think I've got it. Try substituting in this expression:

 

if  List.Count(
        List.Select(
            Table2[string],
            each List.Accumulate(
                    Text.Split(Text.Lower(_), " "),
                    Text.Split(Text.Lower(row[string]), " "),
                    (s, c) => if s <> {} and c = s{0}
                              then List.Skip(s)
                              else s
                 ) = {}
        )
    ) > 0
then "Y"
else "N"

 

The logic here is that it iterates through the Table2 string word by word deleting the first word of the Table1 string if it matches that word. Any Table1 string that gets entirely deleted by this iterative process is one that has all the matches in the right order.

View solution in original post

8 REPLIES 8
mahoneypat
Employee
Employee

This was an interesting one, so I played with it too. Here is another approach that seems to work. FYI that I converted both [string] columns to lowercase first (but you could add that into the expression too instead, if desired).

 

 

let
  Source = Table.FromRows(
    Json.Document(
      Binary.Decompress(
        Binary.FromText(
          "i45WMlTSUQpJLS5RKC4pysxLV4rViVYyAoo55uWXZKQWKZSgyRkD5SJTSxQSccibgMzLyCxGFjMFigWDuQolQCml2FgA",
          BinaryEncoding.Base64
        ),
        Compression.Deflate
      )
    ),
    let
      _t = ((type nullable text) meta [Serialized.Text = true])
    in
      type table [id = _t, string = _t]
  ),
  #"Changed Type" = Table.TransformColumnTypes(Source, {{"id", Int64.Type}, {"string", type text}}),
  #"Lowercased Text" = Table.TransformColumns(#"Changed Type", {{"string", Text.Lower, type text}}),
  #"Added Custom" = Table.AddColumn(
    #"Lowercased Text",
    "Custom",
    each
      let
        thisphrase = [string],
        thislist = Text.Split(thisphrase, " "),
        othertablelist = Table2[string],
        selrows = List.Select(
          othertablelist,
          each Text.Contains(
            Text.Combine(List.Intersect({Text.Split(_, " "), thislist}), " "),
            thisphrase
          )
        ),
        result = if List.Count(selrows) > 0 then "Y" else "N"
      in
        result
  )
in
  #"Added Custom"

 

 

mahoneypat_0-1643147862561.png

 

 

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


This is pretty clever. There are some cases where I think it will fail though. For example, "and this string" wouldn't match with "this string and this other string" since the concatenation of the intersection would be "this string and".

 

I'm not positive but you might be able to resolve this by taking the intersection in the reverse order since it appears that the sort order of the first argument in List.Intersect is the one that is preserved.

AlexisOlson
Super User
Super User

I haven't yet solved the case where order matters but you can use List.ContainsAll to check if all of the words in Table1[string] match a row in Table2[string].

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUQpJLS5RKC4pysxLV4rViVYyAoo55uWXZKQWKZSgyRkD5SJTSxQSccibgMzLyCxGFjMFigWDuQolQCml2FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [id = _t, string = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"id", Int64.Type}, {"string", type text}}),
    #"Added Custom" =
        Table.AddColumn(
            #"Changed Type", "Custom",
            (row) =>
                if  List.Count(
                        List.Select(
                            Table2[string],
                            each List.ContainsAll(
                                    Text.Split(Text.Lower(_), " "),
                                    Text.Split(Text.Lower(row[string]), " ")
                                 )
                        )
                    ) > 0
                then "Y"
                else "N",
            type text
        )
in
    #"Added Custom"

AlexisOlson_0-1643050477171.png

Hi Alexis,

 

That is brilliant, thank you!! Even without the check for order of words, that gets me an awful lot closer to what I'm looking for. I will endeavour to learn more about the various List functions as it looks like they hold the key to resolving quite a few headaches I encounter.

 

Much appreciated!

I think I've got it. Try substituting in this expression:

 

if  List.Count(
        List.Select(
            Table2[string],
            each List.Accumulate(
                    Text.Split(Text.Lower(_), " "),
                    Text.Split(Text.Lower(row[string]), " "),
                    (s, c) => if s <> {} and c = s{0}
                              then List.Skip(s)
                              else s
                 ) = {}
        )
    ) > 0
then "Y"
else "N"

 

The logic here is that it iterates through the Table2 string word by word deleting the first word of the Table1 string if it matches that word. Any Table1 string that gets entirely deleted by this iterative process is one that has all the matches in the right order.

Thanks Alexis, gave this a quick try just now and seems to work but lead to a "There weren't enough elements in the enumeration to complete the operation" error with certain combinations - I haven't had chance to have a proper look yet though, hopefully will later this week. Really appreciate your suggestions!

It needs an empty set check. I've edited my answer.

Amazing!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors