Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi there,
Driving myself to distraction trying to figure this out so would appreciate some help.
My situation is I have two tables which contain columns with strings. I want to take the values from the column on the first table and perform a lookup to see if there is a row in the column on the second table which contains ALL of the individual words in each string, in any position (but ideally only if they appear in the same order as in the original string). The match doesn't need to be case-sensitive.
For example:
Table 1:
id | string | Match in Table 2? |
1 | Test string | Y |
2 | Another test string | Y |
3 | Yet another test string | N |
4 | This string | Y |
5 | String this | N |
Table 2:
id | string |
A | This is a test string |
B | This is another test string |
I have searched far and wide trying to find a solution but hitting a brick wall. Would really appreciate some help!
Thanks,
Adam
Solved! Go to Solution.
I think I've got it. Try substituting in this expression:
if List.Count(
List.Select(
Table2[string],
each List.Accumulate(
Text.Split(Text.Lower(_), " "),
Text.Split(Text.Lower(row[string]), " "),
(s, c) => if s <> {} and c = s{0}
then List.Skip(s)
else s
) = {}
)
) > 0
then "Y"
else "N"
The logic here is that it iterates through the Table2 string word by word deleting the first word of the Table1 string if it matches that word. Any Table1 string that gets entirely deleted by this iterative process is one that has all the matches in the right order.
This was an interesting one, so I played with it too. Here is another approach that seems to work. FYI that I converted both [string] columns to lowercase first (but you could add that into the expression too instead, if desired).
let
Source = Table.FromRows(
Json.Document(
Binary.Decompress(
Binary.FromText(
"i45WMlTSUQpJLS5RKC4pysxLV4rViVYyAoo55uWXZKQWKZSgyRkD5SJTSxQSccibgMzLyCxGFjMFigWDuQolQCml2FgA",
BinaryEncoding.Base64
),
Compression.Deflate
)
),
let
_t = ((type nullable text) meta [Serialized.Text = true])
in
type table [id = _t, string = _t]
),
#"Changed Type" = Table.TransformColumnTypes(Source, {{"id", Int64.Type}, {"string", type text}}),
#"Lowercased Text" = Table.TransformColumns(#"Changed Type", {{"string", Text.Lower, type text}}),
#"Added Custom" = Table.AddColumn(
#"Lowercased Text",
"Custom",
each
let
thisphrase = [string],
thislist = Text.Split(thisphrase, " "),
othertablelist = Table2[string],
selrows = List.Select(
othertablelist,
each Text.Contains(
Text.Combine(List.Intersect({Text.Split(_, " "), thislist}), " "),
thisphrase
)
),
result = if List.Count(selrows) > 0 then "Y" else "N"
in
result
)
in
#"Added Custom"
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
This is pretty clever. There are some cases where I think it will fail though. For example, "and this string" wouldn't match with "this string and this other string" since the concatenation of the intersection would be "this string and".
I'm not positive but you might be able to resolve this by taking the intersection in the reverse order since it appears that the sort order of the first argument in List.Intersect is the one that is preserved.
I haven't yet solved the case where order matters but you can use List.ContainsAll to check if all of the words in Table1[string] match a row in Table2[string].
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUQpJLS5RKC4pysxLV4rViVYyAoo55uWXZKQWKZSgyRkD5SJTSxQSccibgMzLyCxGFjMFigWDuQolQCml2FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [id = _t, string = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"id", Int64.Type}, {"string", type text}}),
#"Added Custom" =
Table.AddColumn(
#"Changed Type", "Custom",
(row) =>
if List.Count(
List.Select(
Table2[string],
each List.ContainsAll(
Text.Split(Text.Lower(_), " "),
Text.Split(Text.Lower(row[string]), " ")
)
)
) > 0
then "Y"
else "N",
type text
)
in
#"Added Custom"
Hi Alexis,
That is brilliant, thank you!! Even without the check for order of words, that gets me an awful lot closer to what I'm looking for. I will endeavour to learn more about the various List functions as it looks like they hold the key to resolving quite a few headaches I encounter.
Much appreciated!
I think I've got it. Try substituting in this expression:
if List.Count(
List.Select(
Table2[string],
each List.Accumulate(
Text.Split(Text.Lower(_), " "),
Text.Split(Text.Lower(row[string]), " "),
(s, c) => if s <> {} and c = s{0}
then List.Skip(s)
else s
) = {}
)
) > 0
then "Y"
else "N"
The logic here is that it iterates through the Table2 string word by word deleting the first word of the Table1 string if it matches that word. Any Table1 string that gets entirely deleted by this iterative process is one that has all the matches in the right order.
Thanks Alexis, gave this a quick try just now and seems to work but lead to a "There weren't enough elements in the enumeration to complete the operation" error with certain combinations - I haven't had chance to have a proper look yet though, hopefully will later this week. Really appreciate your suggestions!
It needs an empty set check. I've edited my answer.
Amazing!
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.