cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Giovanni-PBI Regular Visitor
Regular Visitor

How to extract 5/6 length ID numbers from a text string?

Hi,

I'm searching for a best-practice method to solve the following problem. I'm trying to extract a 5/6 length ID number from the following Page column.


I want to extract the 5 & 6 length ID numbers from the page column for each unique URL and return the result in the VacancyId column.

Is there a best-practice method for this?

2 ACCEPTED SOLUTIONS

Accepted Solutions
Super User
Super User

Re: How to extract 5/6 length ID numbers from a text string?

@Giovanni-PBI

 

Try this..

 

I think it will be close.
Please see the attached file as well

Tricks I learnt from @ImkeFSmiley Wink
But I believe she still would have a better solution

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W0i9LTE4sKS1K1Tc0MDc0NdXNTU1JLS/KTi2yjyk1MDBONjI0NiwCM1ON9YxLS42MTdLS4u2VYnWQNcN1QU0hTzNCmxFMm76hhaGJqbE+adrhtupCtOuSq93YxAhFL8R7SrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Page = _t]),
    ChangedType = Table.TransformColumnTypes(Source,{{"Page", type text}}),
    CharactersToReplace = List.RemoveItems(List.Transform({1..126}, each Character.FromNumber(_)),{"0".."9"}),
    CharactersToReplaceWith=List.Repeat({"|"},List.Count(CharactersToReplace)),
    MyList=List.Zip({CharactersToReplace,CharactersToReplaceWith}),
    Convert=Table.AddColumn(ChangedType, "Custom1", each Text.Combine(
                            List.ReplaceMatchingItems
                            (Text.ToList([Page]),
                            MyList)
                            )),
    Convert2=Table.AddColumn(Convert, "Custom2", each Text.Combine(
                            List.Select(
                            List.RemoveItems(
                            Text.Split([Custom1],"|")
                            ,
                            {null,""})
                            ,
                            each Text.Length(_)>=5)
                            ,
                            ","
                            ))
in
    Convert2

extract.png

Super User
Super User

Re: How to extract 5/6 length ID numbers from a text string?

That's a pretty nifty solution @Zubair_Muhammad!

 

Using Text.SplitAny, the code can be shortened considerably:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W0i9LTE4sKS1K1Tc0MDc0NdXNTU1JLS/KTi2yjyk1MDBONjI0NiwCM1ON9YxLS42MTdLS4u2VYnWQNcN1QU0hTzNCmxFMm76hhaGJqbE+adrhtupCtOuSq93YxAhFL8R7SrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Page = _t]),
    ChangedType = Table.TransformColumnTypes(Source,{{"Page", type text}}),
    CharactersToReplace = Text.Combine(List.RemoveItems(List.Transform({1..126}, each Character.FromNumber(_)),{"0".."9"})),
    Convert=Table.AddColumn(ChangedType, "Custom1", each Text.SplitAny(
                            [Page],
                            CharactersToReplace)
                            ),
    Convert2=Table.AddColumn(Convert, "Custom2", each List.Select(
[Custom1],
(li) => Text.Length(li) >=5){0}
)
in
    Convert2

 

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




6 REPLIES 6
Super User
Super User

Re: How to extract 5/6 length ID numbers from a text string?

I don't see any real pattern to where those numbers appear.


Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!


Giovanni-PBI Regular Visitor
Regular Visitor

Extracting specific numbers from text strings

Hi,

I'm searching for a best-practice method to solve the following problem. I'm trying to extract a specific 4/5/6 length ID number for each line in the following Page column.


I want to extract the correct 4/5/6 length ID number for each line in the Page column and return the result in the VacancyId column.

Is there a best-practice way for this?

Giovanni-PBI Regular Visitor
Regular Visitor

Re: How to extract 5/6 length ID numbers from a text string?

Hi Greg,

Thnx for your reply. Correct, the pattern is very diffent in the Url's.

I'm thinking of a process like this;

Var VacancyId =

//1 List of possible delimiters 
del = {" ".."/",":".."@","[".."'","{".."~", "#(cr)", "#(lf)"},         

//2 List of possible delimiters
delreplacewith = List.Transform(del, each {_, " "}),

//3 Match each delimiter with blank space
delreplacewith = List.Transform(del, each {_, " "}),

//4 Replace each delimiter with blank space
delreplaced = List.ReplaceMatchingItems(Text.ToList(text), delreplacewith), 

//5 split the text by blank space
tolist = Text.Split(backtotext, " "),

//6 Find the first value in the string

//7 If this value  is of length 5, then select this value

//8 If this value <> length 5, then find the second value

//9 If this value  is of length 5, then select this value

//10 Etc.

Return
Var VacancyId

Am I thinking in the right way? Are there better solutions?

Super User
Super User

Re: How to extract 5/6 length ID numbers from a text string?

@Giovanni-PBI

 

Try this..

 

I think it will be close.
Please see the attached file as well

Tricks I learnt from @ImkeFSmiley Wink
But I believe she still would have a better solution

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W0i9LTE4sKS1K1Tc0MDc0NdXNTU1JLS/KTi2yjyk1MDBONjI0NiwCM1ON9YxLS42MTdLS4u2VYnWQNcN1QU0hTzNCmxFMm76hhaGJqbE+adrhtupCtOuSq93YxAhFL8R7SrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Page = _t]),
    ChangedType = Table.TransformColumnTypes(Source,{{"Page", type text}}),
    CharactersToReplace = List.RemoveItems(List.Transform({1..126}, each Character.FromNumber(_)),{"0".."9"}),
    CharactersToReplaceWith=List.Repeat({"|"},List.Count(CharactersToReplace)),
    MyList=List.Zip({CharactersToReplace,CharactersToReplaceWith}),
    Convert=Table.AddColumn(ChangedType, "Custom1", each Text.Combine(
                            List.ReplaceMatchingItems
                            (Text.ToList([Page]),
                            MyList)
                            )),
    Convert2=Table.AddColumn(Convert, "Custom2", each Text.Combine(
                            List.Select(
                            List.RemoveItems(
                            Text.Split([Custom1],"|")
                            ,
                            {null,""})
                            ,
                            each Text.Length(_)>=5)
                            ,
                            ","
                            ))
in
    Convert2

extract.png

Super User
Super User

Re: How to extract 5/6 length ID numbers from a text string?

That's a pretty nifty solution @Zubair_Muhammad!

 

Using Text.SplitAny, the code can be shortened considerably:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W0i9LTE4sKS1K1Tc0MDc0NdXNTU1JLS/KTi2yjyk1MDBONjI0NiwCM1ON9YxLS42MTdLS4u2VYnWQNcN1QU0hTzNCmxFMm76hhaGJqbE+adrhtupCtOuSq93YxAhFL8R7SrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Page = _t]),
    ChangedType = Table.TransformColumnTypes(Source,{{"Page", type text}}),
    CharactersToReplace = Text.Combine(List.RemoveItems(List.Transform({1..126}, each Character.FromNumber(_)),{"0".."9"})),
    Convert=Table.AddColumn(ChangedType, "Custom1", each Text.SplitAny(
                            [Page],
                            CharactersToReplace)
                            ),
    Convert2=Table.AddColumn(Convert, "Custom2", each List.Select(
[Custom1],
(li) => Text.Length(li) >=5){0}
)
in
    Convert2

 

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




Highlighted
Giovanni-PBI Regular Visitor
Regular Visitor

Re: How to extract 5/6 length ID numbers from a text string?

Thanks very much Smiley Happy

@ImkeF
@Zubair_Muhammad

Both the solutions give the result I was looking for.



It also works on 5 and 6+ digit ID’s, so it gives exactly the result I was looking for.

I’m using these solutions to solve a slightly more complicated PBI problem.
I will post a new message for that one.