Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Dropsix
New Member

Pulling a product number with varying lengths from a text field and inputting it into a new column

Good afternoon!  

 

I have a general text field that could contain a paragraph of text and within it, will contain a code that I would like to extract and enter into a new column.

 

For example, I would be looking for " FKR-FY22-614 ". This can change in length based on the last numbers so like FKR-FY22-52 or FKR-FY23-1040 eventually. The one constant is it always starts with FKR-FY

 

I have found some solutions but the code gets complicated and I have a hard time understanding how to apply it in my situation. 

 

Thanks in advance!! 

1 ACCEPTED SOLUTION

To deal with text before the product number, my solution needs to be tweaked a bit. Try this as a custom column definition:

let
  after_prefix = Text.AfterDelimiter([Column1], "FKR-FY" ),
  remove_digits = Text.Remove(after_prefix, {"0".."9","-"}),
  delimiter = Text.Start(remove_digits, 1),
  trim_start = Text.AfterDelimiter([Column1], "FKR-FY"),
  prod_num = Text.BeforeDelimiter(trim_start, delimiter),
  result = if delimiter = ""
           then "FKR-FY" & trim_start
           else "FKR-FY" & prod_num
in
  result

View solution in original post

16 REPLIES 16
Dropsix
New Member

I'll test a few things tonight and tomorrow morning and come back and mark a solution. 

Thanks!

 

 

AlexisOlson
Super User
Super User

If you have a list of characters to split on (space, period, etc.), then you write a custom column with Text.SplitAny:

List.First(Text.SplitAny([Column1], " ."))

AlexisOlson_0-1675370235494.png

 

A more robust solution might be to look for the first character after the number after the second hyphen and then return everything before that character.

let
  after_hyphens = Text.AfterDelimiter([Column1], "-", 1),
  remove_digits = Text.Remove(after_hyphens, {"0".."9"}),
  delimiter = Text.Start(remove_digits, 1),
  prod_num = Text.BeforeDelimiter([Column1], delimiter),
  result = if delimiter = ""
           then [Column1]
           else prod_num
in
  result

That's awesome, I will give that a shot. I like the idea of going in reverse too, that does make sense.

AlB
Super User
Super User

Hi @Dropsix 

If I understand correctly you are looking for strings that start with "FKR-FY". What is the rule to know where the string ends? Is it at the next space?

Can you provide an example with a full paragraph of text?

 

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

Sometimes it's the next space, but sometimes I have seen periods right after.

So for example it could be: 

FKR-FY22-42 was delivered via courier, contents were not verified etc.

 

Or 

 

FKR-FY23-1003. delivered via courier, contents were not verified etc.

@Dropsix 

You´ll have to be more specific about the requirements to get a more robust solution. 

Place the following M code in a blank query to see the steps.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rc5BCoMwFATQqwxZG7GxeAQ33XVXggvRKX4oCSS/6vEN4glKdwMzPMZ7o9wV/eNp+5dz9u6wjRkzP7IyccYqI6b4TcJUlRCUQTO20iFERRnJW8qOOpmh8uaSWntrmrb+DapPKerCBGVWnB87/MkeDg==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),

    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each "FKR-FY" & Text.AfterDelimiter([Column1], "FKR-FY"), type text),
    chars_ = {"0".."9", "a".."z", "A".."Z", "-"},  
    #"Split Column by Character Transition" = Table.SplitColumn(#"Added Custom", "Custom", Splitter.SplitTextByCharacterTransition(chars_, (c) => not List.Contains(chars_, c)), {"Custom"})
in
    #"Split Column by Character Transition"

 

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

To be more specific, I am looking for any mention of FKR and the letters and number that follow it within a free text field. Sometimes it's at the beginning of the field, sometimes in the middle sometimes at the end followed by a period. The criteria is that I find anything that looks like FKR-FY**-** or FKR-FY**-*** or FKR-FY**-**** 

 

I know from a data standpoint this seems unreliable (what if someone doesn't copy and paste that properly or type it in properly) but this is for my reference only and not used in any official way. So I hope that is specific enough as can be, it's just kind of random. 

 

Here are a couple of examples:

 

FKR-FY21-554.
Due to COVID-19, containers were not opened to verify their contents.
Containers were damaged during transport and had to be re-taped.

 

 

"Delivery Company is UPS Express.
FKR-FY19-112."

To deal with text before the product number, my solution needs to be tweaked a bit. Try this as a custom column definition:

let
  after_prefix = Text.AfterDelimiter([Column1], "FKR-FY" ),
  remove_digits = Text.Remove(after_prefix, {"0".."9","-"}),
  delimiter = Text.Start(remove_digits, 1),
  trim_start = Text.AfterDelimiter([Column1], "FKR-FY"),
  prod_num = Text.BeforeDelimiter(trim_start, delimiter),
  result = if delimiter = ""
           then "FKR-FY" & trim_start
           else "FKR-FY" & prod_num
in
  result

hi, exactly what i am looking for, but my case is a bit different. i am trying to extract check numbers of variable length from a text string. the check number can be anywhere and either starts with CHK# or CHK#(blank space). i included some examples. how would i tweak abover code?

 

PAY LINES :1 DD/Reversed Info Payment Chk#342693295 Dt: 04/11/2023
PAY LINES :3-6 BSWETHA CHK#823361000136002  Payment transfer from Grp4Inv#80151559 to Grp100Inv#8144886
PAY LINES :1 PG BC RUN#5973 5974 FB CHK#9024748351 OFFSET
PAY LINES :2-3 PG BC RUN#5973 5974 FB CHK#9024748355 OFFSET
PAY LINES :1 AB/REVERSED INFO PAYMENT CHK#432592773 CHK Date#12/19/2023
PAY LINES :1,3-5 BS/AETNA RUN#19617-19618 WO CHK#992401101036992 OFFSET
PAY LINES :1 PG UHC RUN#9911 9912 WO CHK#1TZ68486268 OFFSET
PAY LINES :1 NREDDY CHK#823354000345182   Trnasfer from Grp4 Inv#77521546 to Grp100 Inv#81810034
PAY LINES :1-2,4-5 PG/COM PAY RUN#1890 CHK#6621406
CHK# 037-0193283221 TYPE: 971 01/19/24
CHK# 003837 TYPE: LKB 01/08/24

i forgot to mention the check number can be a combination of numbers and letters.

figured it out

Hi @meierli, I know that you have found solution already, but this could be also the way:

 

Result

dufoq3_0-1711378710095.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("jVLbattAEP2VIXq10Mzs3W+Stb6QRBay0uCGPBgqEwiRiyIC/fuupObaGvoy7M7OmZlzzt7dXZTpHq42hd/BnCDPk6p5abrn5gds2uMJysOvp6btYfHwGAnJ2gl2CvJ+DigTooSRxcX97FMbEWvIdre+XqewWF9GloXQhIgkNCLDW9O+O7TPx6aDY3d6glX3U27al8giKVLKQX8acgE4ZklKa/XXWQTlCrIFVDdFpJwREIKEZTYOdsjSSCsUwXa53Pn6K5pj8V94dQZPkGZJ5b/5audz2BTLLYTXa1/UI14KVo5N6BpukB/6JiJOyP1TNZqJWAXdktTXRTouRE6TiYdo4XY7reRYIhESCh3OZ/cKrG7WEy3niCAEfu1B9XdtpdWs7Vl8Ufk837/Zp2SwT0hFNtgHddcePvsGg0XGKCYl9btxMDlnaQD/NSTmmQyUy1Wy2F4Pwk2krcNxrtZMEifHhzugMDFS+IFWMBPU+9LPwRkCpFFU+aEUhRXmT8nVZTaUoB1L7n8D", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Ad_CHK#" = Table.AddColumn(Source, "CHK#", each Text.BeforeDelimiter(Text.Trim(Text.Range([Column1], Text.PositionOf([Column1], "CHK#", Occurrence.First, Comparer.OrdinalIgnoreCase)+4)), " "), type text)
in
    #"Ad_CHK#"

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

I think that's about it. I swapped out then "FKR-FY" to null instead and it looks perfect.

 

Thank you so much, I like this. I like AIB's solution also but frankly don't quite understand how it works. 

@Dropsix 

1. The  #"Added Custom" step does a bit of initial cleaning by removing whatever is before "FKR-FY"

2.  From your requirements description (albeit not fully unambiguous), I surmise that the substring we are interested in extracting starts by "FKR-FY" and ends by any character that is not alphanumeric nor "-". From the step in point 1, we know the string starts by "FKR-FY so step " #"Split Column by Character Transition" looks for the position where there is a transition between the "allowed" characters (defined in variable chars_) and other characters. That is the end of the substring of interest and the string is split there. The piece before the split is kept.     

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

Ok that makes sense. I'll take another look at it

I like this and how it works when the text doesn't start with the product number! I also considered a character transition but didn't think of using the optional column parameter in Table.SplitColumn to just get a single column.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors