Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Extract 5 Consecutive Numbers from Text String

Hello Everyone,

 

This is my first post so please let me know if I need to change anything.  I have a need to extract a 5 digit number from a text string.  Unfortunately there wasn't a specific format set for this string so the 5 digits I need appear in different locations of the string.  I have provided an example below of what I need.  What is consistent is that there is only ever 5 consecutive numbers once per string and they always follow a period.  

 

Starting StringExtract
C402.asdfl.43582.243582
F423D.6.62418.LJ486462418
LDOFK.48673.5.KGFO48673
G453.14162.9.LDKFI14162
C532.15354GHC.215354

 

I have tried a variety of methods but so far no luck.  If you have any ideas, please let me know.  

1 ACCEPTED SOLUTION
BA_Pete
Super User
Super User

Hi @Anonymous ,

 

Paste this code over the default code in a new blank query to see the steps I took to complete this:

let
    removeChars = List.Transform({65..90, 97..122}, each Character.FromNumber(_)),
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("HY1LCgIxEETv0uuhMP2JcZ2QqAnMAYYsBHHlzvuDmd71q1dFHwdlvTBev/fnCxVLDKaN/KK5HVSVpSAisoaE8dQUdRUcvTDKXjtWfBUYeqv7uT/RdVMTBA2RccMovT6WdnadTRjBxLTds792oDn/", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Starting String" = _t, Extract = _t]),
    chgTypes = Table.TransformColumnTypes(Source,{{"Starting String", type text}, {"Extract", Int64.Type}}),
    addCalcOutput = Table.AddColumn(chgTypes, "calcOutput", each Text.Remove([Starting String], removeChars)),
    splitByDelimiter = Table.ExpandListColumn(Table.TransformColumns(addCalcOutput, {{"calcOutput", Splitter.SplitTextByDelimiter(".", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "calcOutput"),
    filterLengthFive = Table.SelectRows(splitByDelimiter, each Text.Length([calcOutput]) = 5)
in
    filterLengthFive

 

Summary:

1) Create list of all letters, lower and upper case.

2) Use list to remove all text characters from starting string.

3) Split resulting values by "." delimiter into new rows.

4) Filter column on value length = 5.

 

This gives me the following output ([calcOutput] column):

BA_Pete_0-1643651003975.png

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




View solution in original post

4 REPLIES 4
BA_Pete
Super User
Super User

Hi @Anonymous ,

 

Paste this code over the default code in a new blank query to see the steps I took to complete this:

let
    removeChars = List.Transform({65..90, 97..122}, each Character.FromNumber(_)),
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("HY1LCgIxEETv0uuhMP2JcZ2QqAnMAYYsBHHlzvuDmd71q1dFHwdlvTBev/fnCxVLDKaN/KK5HVSVpSAisoaE8dQUdRUcvTDKXjtWfBUYeqv7uT/RdVMTBA2RccMovT6WdnadTRjBxLTds792oDn/", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Starting String" = _t, Extract = _t]),
    chgTypes = Table.TransformColumnTypes(Source,{{"Starting String", type text}, {"Extract", Int64.Type}}),
    addCalcOutput = Table.AddColumn(chgTypes, "calcOutput", each Text.Remove([Starting String], removeChars)),
    splitByDelimiter = Table.ExpandListColumn(Table.TransformColumns(addCalcOutput, {{"calcOutput", Splitter.SplitTextByDelimiter(".", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "calcOutput"),
    filterLengthFive = Table.SelectRows(splitByDelimiter, each Text.Length([calcOutput]) = 5)
in
    filterLengthFive

 

Summary:

1) Create list of all letters, lower and upper case.

2) Use list to remove all text characters from starting string.

3) Split resulting values by "." delimiter into new rows.

4) Filter column on value length = 5.

 

This gives me the following output ([calcOutput] column):

BA_Pete_0-1643651003975.png

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Anonymous
Not applicable

Thanks @BA_Pete ! I used this to extract times written as a string of four consecutive digits from a column of long-form text strings.


The changes I had to make:

  • changing source to an existing table from another query
  • space as delimiter instead of "."
  • need to exclude other non-numeric characters from the string

My adapted code below in case it's useful for future readers.

 

= let
    removeChars = List.Transform({33..47, 58..90, 97..122}, each Character.FromNumber(_)),
    Source = My_Source,
    chgTypes = Table.TransformColumnTypes(Source,{{"My_Text_Field", type text}, {"My_Output_Column", Int64.Type}}),
    addCalcOutput = Table.AddColumn(chgTypes, "calcOutput", each Text.Remove([My_Text_Field], removeChars)),
    splitByDelimiter = Table.ExpandListColumn(Table.TransformColumns(addCalcOutput, {{"calcOutput", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "calcOutput"),
    filterLengthFour = Table.SelectRows(splitByDelimiter, each Text.Length([calcOutput]) = 4)
in
    filterLengthFour

 

Pedantic suggestion:

Whitelisting digits rather than blacklisting letters is slightly simpler and slightly more robust.

 

That is, instead of Text.Remove([Starting String], removeChars), use Text.Select([Starting String], {".", "0..9"}).

Anonymous
Not applicable

This worked perfectly!  Thank you!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors