Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
gio1082
Helper I
Helper I

Extract part number from text

 Hi, 
I have this data from a description that I have to clean up. It's a long list.

gio1082_1-1596070594774.png

I need to extract the part number (highlighted in yellow) and put it in a single column. All the part numbers start with M follow by some numbers and a letter in between or just a number.  The end result should be like : 

gio1082_2-1596070756534.png

Thank you in advance!

2 ACCEPTED SOLUTIONS
Anonymous
Not applicable

this code aspires to manage almost all cases you showed.

 

the logic is take all the "words" that are between two of these ", # ()" [there is also a blank " "] characters that contain a digit


//add the following lines of code to your advanced editor, in the query where there is your table.
// if your last step is differente from #"Changed Type", change in the expressione that follows consequently

 

 

    #"Added Custom" = Table.AddColumn(#"Changed Type", "extract", each List.Select(Text.SplitAny([Column1],"# ,()"),(p)=> List.ContainsAny(Text.ToList(p) ,{"0".."9"}))),
    #"Expanded extract" = Table.ExpandListColumn(#"Added Custom", "extract")
in
//change also your expressione to this

    #"Expanded extract"

 

 

View solution in original post

Anonymous
Not applicable

 
 
if you want to be helped, you should find the time to be more precise and clear.
"IT" can be too many things, to get an idea of what "doesn't work".
Maybe the problem is on the name of the columns, or you and I have a different version of PBI that behaves differently on our PC.
However here is the code with the example data used and the result obtained
 
 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMkzJT1bSUSooKFQoSi3ILypRSMsvUvA1NzMxNbZwNzUx1gFyTC1MDS2czExNdBQyDm/KObxJKVYHQytco5m5qSm6Zjew5sSkZGw6oZYG5ClDrDUwwNSaXpKSBtZrBNFLsmuNCXtUA0mvpkJaGsSxJkgaYS71g7oUrA3ZQl8Lc3OLUEsLZJtNCdsMMhBsiqWFObrDzciMoVgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "extract", each List.Select(Text.SplitAny([Column2],"# ,()"),(p)=> List.ContainsAny(Text.ToList(p) ,{"0".."9"}))),
    #"Expanded extract" = Table.ExpandListColumn(#"Added Custom", "extract")
in
    #"Expanded extract"


image.png

 

 
if that's not what you want, you should indicate exactly where changes are to be made.
 

View solution in original post

12 REPLIES 12
v-alq-msft
Community Support
Community Support

Hi, @gio1082 

 

Based on your description, I created data to reproduce your scenario. The pbix file is attached in the end.

 

Table:

a1.png

 

Here are the codes in 'Advanced Editor'.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCggIVChKLcgvKlFIyy9S8DU3MjM3NXC2MDbWgXPMzUyVYnUgaoMgan0tLUxMLUwcDQwMdaAcUyAHizIzI2MjuDJjS0uQMmO4OiSrA/yUDQzMTIwNzYCaTEyMLBWCPf18lGJjAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Text = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Text", type text}}),
    Custom1 = Table.ReplaceValue(#"Changed Type",","," ",Replacer.ReplaceText,{"Text"}),
    #"Split Column by Delimiter" = Table.SplitColumn(Custom1, "Text", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Text.1", "Text.2", "Text.3", "Text.4", "Text.5"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Text.1", type text}, {"Text.2", type text}, {"Text.3", type text}, {"Text.4", type text}, {"Text.5", type text}}),
    Custom2 = List.Combine(
    {
    Table.Column(#"Changed Type1","Text.1"),
    Table.Column(#"Changed Type1","Text.2"),
    Table.Column(#"Changed Type1","Text.3"),
    Table.Column(#"Changed Type1","Text.4"),
    Table.Column(#"Changed Type1","Text.5")
    }
),
    #"Converted to Table" = Table.FromList(Custom2, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    Custom3 = Table.SelectRows(#"Converted to Table",each
let 
  column = [Column1],
  txt = Text.Middle([Column1],1,Text.Length([Column1])-1)
in
(Text.StartsWith(column,"M") and Text.PositionOfAny(txt,{"A".."Z","a".."z"})>=1 and Text.PositionOfAny(txt,{"A".."Z","a".."z"})<=Text.Length(txt)-2
) or
Text.Contains(column,"#")


),
    #"Added Custom" = Table.AddColumn(Custom3, "Custom", each if
Text.Contains([Column1],"#") then 
Text.Middle(
    [Column1],
    Text.PositionOf([Column1],"#")+1,
    Text.Length([Column1])-1
) 
else 
[Column1]),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Column1"})
in
    #"Removed Columns"

 

Result:

a2.png

 

Best Regards

Allan

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

Anonymous
Not applicable

this code aspires to manage almost all cases you showed.

 

the logic is take all the "words" that are between two of these ", # ()" [there is also a blank " "] characters that contain a digit


//add the following lines of code to your advanced editor, in the query where there is your table.
// if your last step is differente from #"Changed Type", change in the expressione that follows consequently

 

 

    #"Added Custom" = Table.AddColumn(#"Changed Type", "extract", each List.Select(Text.SplitAny([Column1],"# ,()"),(p)=> List.ContainsAny(Text.ToList(p) ,{"0".."9"}))),
    #"Expanded extract" = Table.ExpandListColumn(#"Added Custom", "extract")
in
//change also your expressione to this

    #"Expanded extract"

 

 

thank you for all the help, sorry, for all the trouble. 

so here are all my columns in my tables

gio1082_2-1596131732580.png

This is the code before your code

gio1082_3-1596131777972.png

This is how I try to input your code 

gio1082_0-1596131458164.png

this is the error that I get (I change [column1] to [column2], but I also tried it the 1)

gio1082_1-1596131526948.png

what I am doing wrong? also I don't need the numbers that end DOC

 

thank you!

 

Anonymous
Not applicable

it seems that somewhere a comma is missing: each expression, except the last before in, must end with a comma

try adding a comma at end of expression that starts with #Changed Type".

After that, if it works, you could do some toilette to the table, promoting the first row to header role.

But this, seems, non essential for the moment.

 

 

Can you help me extract all the part numbers that start with M?

It didn't work :(, but thanks for all you help!

Anonymous
Not applicable

 
 
if you want to be helped, you should find the time to be more precise and clear.
"IT" can be too many things, to get an idea of what "doesn't work".
Maybe the problem is on the name of the columns, or you and I have a different version of PBI that behaves differently on our PC.
However here is the code with the example data used and the result obtained
 
 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMkzJT1bSUSooKFQoSi3ILypRSMsvUvA1NzMxNbZwNzUx1gFyTC1MDS2czExNdBQyDm/KObxJKVYHQytco5m5qSm6Zjew5sSkZGw6oZYG5ClDrDUwwNSaXpKSBtZrBNFLsmuNCXtUA0mvpkJaGsSxJkgaYS71g7oUrA3ZQl8Lc3OLUEsLZJtNCdsMMhBsiqWFObrDzciMoVgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "extract", each List.Select(Text.SplitAny([Column2],"# ,()"),(p)=> List.ContainsAny(Text.ToList(p) ,{"0".."9"}))),
    #"Expanded extract" = Table.ExpandListColumn(#"Added Custom", "extract")
in
    #"Expanded extract"


image.png

 

 
if that's not what you want, you should indicate exactly where changes are to be made.
 

this worked!!!

Greg_Deckler
Super User
Super User

@gio1082 - Well, for the M ones you could do something like this:

 

Column = MID([Column],SEARCH(" M",[Column])+ 1, 11)

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

It gave me an error 😞

gio1082_0-1596122549763.png

 

Anonymous
Not applicable

from what little you see in the scree shot, I assume that the strings to be extracted are those that contain an "M" or a "#" and are between a space and a comma.

if not exactly like that, explain in detail the various situations that occur
 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WKigoVChKLcgvKlFIyy9S8DU3MzE1tnA3NTHWAXJMLUwNLZzMTE10FDIOb8o5vEkpVgehB67DzNzUFF2XG1hXYlIyihaoNQF5yhCLDAww9aSXpKSBNZHtMAJ60tKQHAVzkR/URWRbADIBrMvSwpxMhyH0xAIA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "extract", each List.Select(Text.SplitAny([Column1]," ,"),(p)=> Text.Contains(p,"M") or Text.Contains(p,"#"))),
    #"Expanded extract" = Table.ExpandListColumn(#"Added Custom", "extract")
in
    #"Expanded extract"

 

Hi, 

Well, in general cases it looks like this.

gio1082_2-1596122956770.png

but sometimes it will be in between parenthesis

gio1082_3-1596123081148.png

other before a #

gio1082_4-1596123164421.png

 

Also not sure how to use your code. Do  I use it as a new column? what are the items I need to change to make it work for mine?

 

thanks

 

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors