Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

How to extract 5/6 length ID numbers from a text string?

Hi,

I'm searching for a best-practice method to solve the following problem. I'm trying to extract a 5/6 length ID number from the following Page column.


I want to extract the 5 & 6 length ID numbers from the page column for each unique URL and return the result in the VacancyId column.

Is there a best-practice method for this?

2 ACCEPTED SOLUTIONS

@Anonymous

 

Try this..

 

I think it will be close.
Please see the attached file as well

Tricks I learnt from @ImkeFSmiley Wink
But I believe she still would have a better solution

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W0i9LTE4sKS1K1Tc0MDc0NdXNTU1JLS/KTi2yjyk1MDBONjI0NiwCM1ON9YxLS42MTdLS4u2VYnWQNcN1QU0hTzNCmxFMm76hhaGJqbE+adrhtupCtOuSq93YxAhFL8R7SrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Page = _t]),
    ChangedType = Table.TransformColumnTypes(Source,{{"Page", type text}}),
    CharactersToReplace = List.RemoveItems(List.Transform({1..126}, each Character.FromNumber(_)),{"0".."9"}),
    CharactersToReplaceWith=List.Repeat({"|"},List.Count(CharactersToReplace)),
    MyList=List.Zip({CharactersToReplace,CharactersToReplaceWith}),
    Convert=Table.AddColumn(ChangedType, "Custom1", each Text.Combine(
                            List.ReplaceMatchingItems
                            (Text.ToList([Page]),
                            MyList)
                            )),
    Convert2=Table.AddColumn(Convert, "Custom2", each Text.Combine(
                            List.Select(
                            List.RemoveItems(
                            Text.Split([Custom1],"|")
                            ,
                            {null,""})
                            ,
                            each Text.Length(_)>=5)
                            ,
                            ","
                            ))
in
    Convert2

extract.png


Regards
Zubair

Please try my custom visuals

View solution in original post

That's a pretty nifty solution @Zubair_Muhammad!

 

Using Text.SplitAny, the code can be shortened considerably:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W0i9LTE4sKS1K1Tc0MDc0NdXNTU1JLS/KTi2yjyk1MDBONjI0NiwCM1ON9YxLS42MTdLS4u2VYnWQNcN1QU0hTzNCmxFMm76hhaGJqbE+adrhtupCtOuSq93YxAhFL8R7SrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Page = _t]),
    ChangedType = Table.TransformColumnTypes(Source,{{"Page", type text}}),
    CharactersToReplace = Text.Combine(List.RemoveItems(List.Transform({1..126}, each Character.FromNumber(_)),{"0".."9"})),
    Convert=Table.AddColumn(ChangedType, "Custom1", each Text.SplitAny(
                            [Page],
                            CharactersToReplace)
                            ),
    Convert2=Table.AddColumn(Convert, "Custom2", each List.Select(
[Custom1],
(li) => Text.Length(li) >=5){0}
)
in
    Convert2

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

7 REPLIES 7
Greg_Deckler
Super User
Super User

I don't see any real pattern to where those numbers appear.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...
Anonymous
Not applicable

Hi Greg,

Thnx for your reply. Correct, the pattern is very diffent in the Url's.

I'm thinking of a process like this;

Var VacancyId =

//1 List of possible delimiters 
del = {" ".."/",":".."@","[".."'","{".."~", "#(cr)", "#(lf)"},         

//2 List of possible delimiters
delreplacewith = List.Transform(del, each {_, " "}),

//3 Match each delimiter with blank space
delreplacewith = List.Transform(del, each {_, " "}),

//4 Replace each delimiter with blank space
delreplaced = List.ReplaceMatchingItems(Text.ToList(text), delreplacewith), 

//5 split the text by blank space
tolist = Text.Split(backtotext, " "),

//6 Find the first value in the string

//7 If this value  is of length 5, then select this value

//8 If this value <> length 5, then find the second value

//9 If this value  is of length 5, then select this value

//10 Etc.

Return
Var VacancyId

Am I thinking in the right way? Are there better solutions?

@Anonymous

 

Try this..

 

I think it will be close.
Please see the attached file as well

Tricks I learnt from @ImkeFSmiley Wink
But I believe she still would have a better solution

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W0i9LTE4sKS1K1Tc0MDc0NdXNTU1JLS/KTi2yjyk1MDBONjI0NiwCM1ON9YxLS42MTdLS4u2VYnWQNcN1QU0hTzNCmxFMm76hhaGJqbE+adrhtupCtOuSq93YxAhFL8R7SrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Page = _t]),
    ChangedType = Table.TransformColumnTypes(Source,{{"Page", type text}}),
    CharactersToReplace = List.RemoveItems(List.Transform({1..126}, each Character.FromNumber(_)),{"0".."9"}),
    CharactersToReplaceWith=List.Repeat({"|"},List.Count(CharactersToReplace)),
    MyList=List.Zip({CharactersToReplace,CharactersToReplaceWith}),
    Convert=Table.AddColumn(ChangedType, "Custom1", each Text.Combine(
                            List.ReplaceMatchingItems
                            (Text.ToList([Page]),
                            MyList)
                            )),
    Convert2=Table.AddColumn(Convert, "Custom2", each Text.Combine(
                            List.Select(
                            List.RemoveItems(
                            Text.Split([Custom1],"|")
                            ,
                            {null,""})
                            ,
                            each Text.Length(_)>=5)
                            ,
                            ","
                            ))
in
    Convert2

extract.png


Regards
Zubair

Please try my custom visuals

That's a pretty nifty solution @Zubair_Muhammad!

 

Using Text.SplitAny, the code can be shortened considerably:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W0i9LTE4sKS1K1Tc0MDc0NdXNTU1JLS/KTi2yjyk1MDBONjI0NiwCM1ON9YxLS42MTdLS4u2VYnWQNcN1QU0hTzNCmxFMm76hhaGJqbE+adrhtupCtOuSq93YxAhFL8R7SrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Page = _t]),
    ChangedType = Table.TransformColumnTypes(Source,{{"Page", type text}}),
    CharactersToReplace = Text.Combine(List.RemoveItems(List.Transform({1..126}, each Character.FromNumber(_)),{"0".."9"})),
    Convert=Table.AddColumn(ChangedType, "Custom1", each Text.SplitAny(
                            [Page],
                            CharactersToReplace)
                            ),
    Convert2=Table.AddColumn(Convert, "Custom2", each List.Select(
[Custom1],
(li) => Text.Length(li) >=5){0}
)
in
    Convert2

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

This solution is great.. I am using it for a different application.
I have one column containing 4 and 5 digit numbers that I need to grab and add to a new column. This same column also has 10 digit numbers (in the same column as the 4 and 5 digit numbers) that I need to also grab and add to another column.
So, current column contains 4/5 digits as well as 10 digit numbers.... 4/5 digits I want to extract into a column.... 10 digits I want to extract into another column.  
The solution you posted above works well in extracting them, except they are all in the same column... I can then do another transform from there. However, if the list doesn't contain either... I get "Errors"  --- what is the best way for me to replace such cases with "null" if no match is found for >=4 ?

Anonymous
Not applicable

Thanks very much 🙂

@ImkeF
@Zubair_Muhammad

Both the solutions give the result I was looking for.



It also works on 5 and 6+ digit ID’s, so it gives exactly the result I was looking for.

I’m using these solutions to solve a slightly more complicated PBI problem.
I will post a new message for that one. 

Anonymous
Not applicable

Hi,

I'm searching for a best-practice method to solve the following problem. I'm trying to extract a specific 4/5/6 length ID number for each line in the following Page column.


I want to extract the correct 4/5/6 length ID number for each line in the Page column and return the result in the VacancyId column.

Is there a best-practice way for this?

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.