Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Good afternoon!
I have a general text field that could contain a paragraph of text and within it, will contain a code that I would like to extract and enter into a new column.
For example, I would be looking for " FKR-FY22-614 ". This can change in length based on the last numbers so like FKR-FY22-52 or FKR-FY23-1040 eventually. The one constant is it always starts with FKR-FY
I have found some solutions but the code gets complicated and I have a hard time understanding how to apply it in my situation.
Thanks in advance!!
Solved! Go to Solution.
To deal with text before the product number, my solution needs to be tweaked a bit. Try this as a custom column definition:
let
after_prefix = Text.AfterDelimiter([Column1], "FKR-FY" ),
remove_digits = Text.Remove(after_prefix, {"0".."9","-"}),
delimiter = Text.Start(remove_digits, 1),
trim_start = Text.AfterDelimiter([Column1], "FKR-FY"),
prod_num = Text.BeforeDelimiter(trim_start, delimiter),
result = if delimiter = ""
then "FKR-FY" & trim_start
else "FKR-FY" & prod_num
in
result
I'll test a few things tonight and tomorrow morning and come back and mark a solution.
Thanks!
If you have a list of characters to split on (space, period, etc.), then you write a custom column with Text.SplitAny:
List.First(Text.SplitAny([Column1], " ."))
A more robust solution might be to look for the first character after the number after the second hyphen and then return everything before that character.
let
after_hyphens = Text.AfterDelimiter([Column1], "-", 1),
remove_digits = Text.Remove(after_hyphens, {"0".."9"}),
delimiter = Text.Start(remove_digits, 1),
prod_num = Text.BeforeDelimiter([Column1], delimiter),
result = if delimiter = ""
then [Column1]
else prod_num
in
result
That's awesome, I will give that a shot. I like the idea of going in reverse too, that does make sense.
Hi @Dropsix
If I understand correctly you are looking for strings that start with "FKR-FY". What is the rule to know where the string ends? Is it at the next space?
Can you provide an example with a full paragraph of text?
|
Please accept the solution when done and consider giving a thumbs up if posts are helpful. Contact me privately for support with any larger-scale BI needs, tutoring, etc. |
Sometimes it's the next space, but sometimes I have seen periods right after.
So for example it could be:
FKR-FY22-42 was delivered via courier, contents were not verified etc.
Or
FKR-FY23-1003. delivered via courier, contents were not verified etc.
You´ll have to be more specific about the requirements to get a more robust solution.
Place the following M code in a blank query to see the steps.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rc5BCoMwFATQqwxZG7GxeAQ33XVXggvRKX4oCSS/6vEN4glKdwMzPMZ7o9wV/eNp+5dz9u6wjRkzP7IyccYqI6b4TcJUlRCUQTO20iFERRnJW8qOOpmh8uaSWntrmrb+DapPKerCBGVWnB87/MkeDg==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each "FKR-FY" & Text.AfterDelimiter([Column1], "FKR-FY"), type text),
chars_ = {"0".."9", "a".."z", "A".."Z", "-"},
#"Split Column by Character Transition" = Table.SplitColumn(#"Added Custom", "Custom", Splitter.SplitTextByCharacterTransition(chars_, (c) => not List.Contains(chars_, c)), {"Custom"})
in
#"Split Column by Character Transition"
|
Please accept the solution when done and consider giving a thumbs up if posts are helpful. Contact me privately for support with any larger-scale BI needs, tutoring, etc. |
To be more specific, I am looking for any mention of FKR and the letters and number that follow it within a free text field. Sometimes it's at the beginning of the field, sometimes in the middle sometimes at the end followed by a period. The criteria is that I find anything that looks like FKR-FY**-** or FKR-FY**-*** or FKR-FY**-****
I know from a data standpoint this seems unreliable (what if someone doesn't copy and paste that properly or type it in properly) but this is for my reference only and not used in any official way. So I hope that is specific enough as can be, it's just kind of random.
Here are a couple of examples:
FKR-FY21-554. Due to COVID-19, containers were not opened to verify their contents. Containers were damaged during transport and had to be re-taped. |
"Delivery Company is UPS Express. FKR-FY19-112." |
To deal with text before the product number, my solution needs to be tweaked a bit. Try this as a custom column definition:
let
after_prefix = Text.AfterDelimiter([Column1], "FKR-FY" ),
remove_digits = Text.Remove(after_prefix, {"0".."9","-"}),
delimiter = Text.Start(remove_digits, 1),
trim_start = Text.AfterDelimiter([Column1], "FKR-FY"),
prod_num = Text.BeforeDelimiter(trim_start, delimiter),
result = if delimiter = ""
then "FKR-FY" & trim_start
else "FKR-FY" & prod_num
in
result
hi, exactly what i am looking for, but my case is a bit different. i am trying to extract check numbers of variable length from a text string. the check number can be anywhere and either starts with CHK# or CHK#(blank space). i included some examples. how would i tweak abover code?
PAY LINES :1 DD/Reversed Info Payment Chk#342693295 Dt: 04/11/2023 |
PAY LINES :3-6 BSWETHA CHK#823361000136002 Payment transfer from Grp4Inv#80151559 to Grp100Inv#8144886 |
PAY LINES :1 PG BC RUN#5973 5974 FB CHK#9024748351 OFFSET |
PAY LINES :2-3 PG BC RUN#5973 5974 FB CHK#9024748355 OFFSET |
PAY LINES :1 AB/REVERSED INFO PAYMENT CHK#432592773 CHK Date#12/19/2023 |
PAY LINES :1,3-5 BS/AETNA RUN#19617-19618 WO CHK#992401101036992 OFFSET |
PAY LINES :1 PG UHC RUN#9911 9912 WO CHK#1TZ68486268 OFFSET |
PAY LINES :1 NREDDY CHK#823354000345182 Trnasfer from Grp4 Inv#77521546 to Grp100 Inv#81810034 |
PAY LINES :1-2,4-5 PG/COM PAY RUN#1890 CHK#6621406 |
CHK# 037-0193283221 TYPE: 971 01/19/24 |
CHK# 003837 TYPE: LKB 01/08/24 |
i forgot to mention the check number can be a combination of numbers and letters.
figured it out
Hi @meierli, I know that you have found solution already, but this could be also the way:
Result
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("jVLbattAEP2VIXq10Mzs3W+Stb6QRBay0uCGPBgqEwiRiyIC/fuupObaGvoy7M7OmZlzzt7dXZTpHq42hd/BnCDPk6p5abrn5gds2uMJysOvp6btYfHwGAnJ2gl2CvJ+DigTooSRxcX97FMbEWvIdre+XqewWF9GloXQhIgkNCLDW9O+O7TPx6aDY3d6glX3U27al8giKVLKQX8acgE4ZklKa/XXWQTlCrIFVDdFpJwREIKEZTYOdsjSSCsUwXa53Pn6K5pj8V94dQZPkGZJ5b/5audz2BTLLYTXa1/UI14KVo5N6BpukB/6JiJOyP1TNZqJWAXdktTXRTouRE6TiYdo4XY7reRYIhESCh3OZ/cKrG7WEy3niCAEfu1B9XdtpdWs7Vl8Ufk837/Zp2SwT0hFNtgHddcePvsGg0XGKCYl9btxMDlnaQD/NSTmmQyUy1Wy2F4Pwk2krcNxrtZMEifHhzugMDFS+IFWMBPU+9LPwRkCpFFU+aEUhRXmT8nVZTaUoB1L7n8D", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Ad_CHK#" = Table.AddColumn(Source, "CHK#", each Text.BeforeDelimiter(Text.Trim(Text.Range([Column1], Text.PositionOf([Column1], "CHK#", Occurrence.First, Comparer.OrdinalIgnoreCase)+4)), " "), type text)
in
#"Ad_CHK#"
I think that's about it. I swapped out then "FKR-FY" to null instead and it looks perfect.
Thank you so much, I like this. I like AIB's solution also but frankly don't quite understand how it works.
1. The #"Added Custom" step does a bit of initial cleaning by removing whatever is before "FKR-FY"
2. From your requirements description (albeit not fully unambiguous), I surmise that the substring we are interested in extracting starts by "FKR-FY" and ends by any character that is not alphanumeric nor "-". From the step in point 1, we know the string starts by "FKR-FY so step " #"Split Column by Character Transition" looks for the position where there is a transition between the "allowed" characters (defined in variable chars_
) and other characters. That is the end of the substring of interest and the string is split there. The piece before the split is kept.
|
Please accept the solution when done and consider giving a thumbs up if posts are helpful. Contact me privately for support with any larger-scale BI needs, tutoring, etc. |
Ok that makes sense. I'll take another look at it
I like this and how it works when the text doesn't start with the product number! I also considered a character transition but didn't think of using the optional column parameter in Table.SplitColumn to just get a single column.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.