Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Extract the number and text from String

Hello All,

 

I have certain requirement where I have to find the floor no. based on the given Text.

Problem is given Text is not in the same format and also sometimes no number in the text, In that case need to find the middle two letter from the text.

 

For example:

 

VH 1. OG L1
VH 1. OG R1
VH 2. OG L2
VH 2. OG R2
VH 3. OG L3
VH DG LDG
VH DG RDG
HH EG LEG
HH EG L HEG
HH EG R VEG
1.OG 4. WHG1
2. OG 1. WHG2
2. OG 2. WHG2
2. OG 3. WHG2
2. OG 4. WHG2
3. OG 1. WHG3
3. OG 2. WHG3
3. OG 3. WHG3
SFL 1. OG li1
SFL 1. OG re1
SFL 2. OG li2
SFL 2. OG re2

SFL 3. OG li

3

1 ACCEPTED SOLUTION

Hi @Anonymous,

 

That is due to delimiters. The code expects a period and/or a blank space. For TE74,12,EG re,102 the result is re,102 a space is found in between EG and re but did not consider the comma as a delimiter and so Text.Split was not able to produce the desired result.

 

Try the modified code below. If this still doesn't work, please provide a table that best represents how bad your data is. That way, the community can come up with a better solution.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZLLasNADEV/ZfBaDNHDSbuP61kECm5xFyGrEmggqzT/T2bkeRovfY50pRE+n7vZGbTmczRTBx12F1BEik4eUYumGnGu4oSOy/dxrMBUgHNmWCqGBhi3RpOZC0Lr54g1P27MWy77YIJUQdqCvAWlhdxmcgVpC3ILvz5O8Zj3W96zwMe1gVQqaQW1skAulcug7+EggARDKAXcUTmVVxzV/eYVNooABXqrI2CvfX3lwBSHtcPc50NXbldn9m2mdwz+ykmGRkkSSIBjpqhKr9u/h0yJrm0LLmWqpFq+hUaO80RdDj1ktxo46x9H9nH9/Xv+x4NdXg==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Test = _t, #"Desired Result" = _t]),
    #"Added Custom" = Table.AddColumn(Source, "Replace and Split", 
            each let 
                del = {" ".."/",":".."@","[".."'","{".."~", "#(cr)", "#(lf)"}, //list of possible delimiters based on ASCII table plus the new line separators
                delreplacewith = List.Transform(del, each {_, " "}), //match each delimiter with blank space
                delreplaced = List.ReplaceMatchingItems(Text.ToList([Test]), delreplacewith), //replace each delimiter with blank space
                backtotext =  Text.Combine(delreplaced) //convert the list back to text
            in Text.Split(backtotext, " "), type list), //split text by blank space

     //1 - check if each value in column "Replace and Split" can be succesfully converted to number otherwise null, 2 - remove the nulls, 3 -return index 0 if there is a value else null
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Number?", each try ( List.Select(List.Transform([Replace and Split], each try Number.FromText(_) otherwise null), each _ <> null){0} ) otherwise null, Int64.Type),
    
    //if "Number?" is null get the second item from "Replace and Split" otherwise null
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Text?", each if [#"Number?"]= null then [Replace and Split]{1} else null, type text),

    //and finally the floor nuber, if "Number?" is not null convert it to text and if null get the value from "Text?"
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "Floor Number", each if [#"Number?"] = null then [#"Text?"] else Number.ToText([#"Number?"]), type text)
in
    #"Added Custom3"

 

I have also converted the  code above to a custom function. Place the code in a blank query and name it as desired. You can use the function by either using Invoke Custom Function feature or by creating a custom column. The function has only one parameter and that is the text you want to extract the floor number/letter from.

 

let 

func = (text as text) as any=>

    let
        //list of possible delimiters based on ASCII table plus the new line separators
        del = {" ".."/",":".."@","[".."'","{".."~", "#(cr)", "#(lf)"},         

        //match each delimiter with blank space
        delreplacewith = List.Transform(del, each {_, " "}), 

        //replace each delimiter with blank space
        delreplaced = List.ReplaceMatchingItems(Text.ToList(text), delreplacewith),  
        
        //convert the list back to text, all delimiters now replaced with blank space
        backtotext =  Text.Combine(delreplaced),

        //split the text by blank space
        tolist = Text.Split(backtotext, " "),

        //check if a value in the list is a number and then select the first number
        number = try List.Select(List.Transform(tolist, each try Number.FromText(_) otherwise null), each _ <> null){0} otherwise null,

        //if does not contain a number, get the second text string  - index position 1
        notnumber = if number = null then tolist{1} else number

in notnumber 

in func

 










Did I answer your question? Mark my post as a solution!


Proud to be a Super User!









"Tell me and I’ll forget; show me and I may remember; involve me and I’ll understand."
Need Power BI consultation, get in touch with me on LinkedIn or hire me on UpWork.
Learn with me on YouTube @DAXJutsu or follow my page on Facebook @DAXJutsuPBI.

View solution in original post

18 REPLIES 18
danextian
Super User
Super User

Hi @Anonymous

 

If I understand correctly and based on the sample data, the first numbers found is the floor number but if there is no number then that is the middle texts after space/period? If so, try this query:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bdCxCoQwDAbgVwmdpdDER1CbQRB64A3i6CA43fsPV1ptGumYj5/kJ9tmVgZnYfEQTGec2btEmGiOhJpCTVRS9NCQ58FXEASYYcyJUQHwmwKsQs7GO72FL/vSMvdxD2KF2EJqYa+R9E6qEFtIGj/TfD/zOktPwd+hECWJL0xJQZJkPLT/AQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Test = _t, #"Desired Result" = _t]),
    
    //1 - replace period with space, 2 - split the new value by space, 3 - select values that are not blank
    #"Added Custom" = Table.AddColumn(Source, "Replace and Split", each List.Select(Text.Split(Text.Replace([Test], ".", " "), " "), each _ <> ""), type list),

    //1 - check if each value in column "Replace and Split" can be succesfully converted to number otherwise null, 2 - remove the nulls, 3 -return index 0 if there is a value else null
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Number?", each try ( List.Select(List.Transform([Replace and Split], each try Number.FromText(_) otherwise null), each _ <> null){0} ) otherwise null, Int64.Type),
    
    //if "Number?" is null get the second item from "Replace and Split" otherwise null
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Text?", each if [#"Number?"]= null then [Replace and Split]{1} else null, type text),

    //and finally the floor nuber, if "Number?" is not null convert it to text and if null get the value from "Text?"
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "Floor Number", each if [#"Number?"] = null then [#"Text?"] else Number.ToText([#"Number?"]), type text)
in
    #"Added Custom3"

 

To easily understand the script, I've divided into  into several steps instead of nesting them all in one formula.










Did I answer your question? Mark my post as a solution!


Proud to be a Super User!









"Tell me and I’ll forget; show me and I may remember; involve me and I’ll understand."
Need Power BI consultation, get in touch with me on LinkedIn or hire me on UpWork.
Learn with me on YouTube @DAXJutsu or follow my page on Facebook @DAXJutsuPBI.
Anonymous
Not applicable

Hi @danextian , Your solution is quite good. But somehow my dataset is so bad that it will not solve all the problem. But thanks a lot for helping me out. 

For example: 

DesiredYour resultlage
EGre,102TE74,12,EG re,102
EGli,101TE73,12,EG li,101
5OGTE72,14,5.OG re,602
55TE72,1, 5.OG re,601
5OGTE71,14,5.OG li,601
5OGTE70,14,5.OG re,502
44TE70,13, 4.OG re,501
3OGTE7,24,3.OG li,401
4OGTE69,14,4.OG li,501
44TE69,13, 4.OG li,502
3OGTE68,14,3.OG re,402
3OGTE67,14,3.OG li,401
EG2VH EG 2.rechts

 

And I also noticed that when lage = null then it results into error, as in the ss.

Capture.PNG

 

 

 

 

Hi @Anonymous,

 

That is due to delimiters. The code expects a period and/or a blank space. For TE74,12,EG re,102 the result is re,102 a space is found in between EG and re but did not consider the comma as a delimiter and so Text.Split was not able to produce the desired result.

 

Try the modified code below. If this still doesn't work, please provide a table that best represents how bad your data is. That way, the community can come up with a better solution.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZLLasNADEV/ZfBaDNHDSbuP61kECm5xFyGrEmggqzT/T2bkeRovfY50pRE+n7vZGbTmczRTBx12F1BEik4eUYumGnGu4oSOy/dxrMBUgHNmWCqGBhi3RpOZC0Lr54g1P27MWy77YIJUQdqCvAWlhdxmcgVpC3ILvz5O8Zj3W96zwMe1gVQqaQW1skAulcug7+EggARDKAXcUTmVVxzV/eYVNooABXqrI2CvfX3lwBSHtcPc50NXbldn9m2mdwz+ykmGRkkSSIBjpqhKr9u/h0yJrm0LLmWqpFq+hUaO80RdDj1ktxo46x9H9nH9/Xv+x4NdXg==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Test = _t, #"Desired Result" = _t]),
    #"Added Custom" = Table.AddColumn(Source, "Replace and Split", 
            each let 
                del = {" ".."/",":".."@","[".."'","{".."~", "#(cr)", "#(lf)"}, //list of possible delimiters based on ASCII table plus the new line separators
                delreplacewith = List.Transform(del, each {_, " "}), //match each delimiter with blank space
                delreplaced = List.ReplaceMatchingItems(Text.ToList([Test]), delreplacewith), //replace each delimiter with blank space
                backtotext =  Text.Combine(delreplaced) //convert the list back to text
            in Text.Split(backtotext, " "), type list), //split text by blank space

     //1 - check if each value in column "Replace and Split" can be succesfully converted to number otherwise null, 2 - remove the nulls, 3 -return index 0 if there is a value else null
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Number?", each try ( List.Select(List.Transform([Replace and Split], each try Number.FromText(_) otherwise null), each _ <> null){0} ) otherwise null, Int64.Type),
    
    //if "Number?" is null get the second item from "Replace and Split" otherwise null
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Text?", each if [#"Number?"]= null then [Replace and Split]{1} else null, type text),

    //and finally the floor nuber, if "Number?" is not null convert it to text and if null get the value from "Text?"
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "Floor Number", each if [#"Number?"] = null then [#"Text?"] else Number.ToText([#"Number?"]), type text)
in
    #"Added Custom3"

 

I have also converted the  code above to a custom function. Place the code in a blank query and name it as desired. You can use the function by either using Invoke Custom Function feature or by creating a custom column. The function has only one parameter and that is the text you want to extract the floor number/letter from.

 

let 

func = (text as text) as any=>

    let
        //list of possible delimiters based on ASCII table plus the new line separators
        del = {" ".."/",":".."@","[".."'","{".."~", "#(cr)", "#(lf)"},         

        //match each delimiter with blank space
        delreplacewith = List.Transform(del, each {_, " "}), 

        //replace each delimiter with blank space
        delreplaced = List.ReplaceMatchingItems(Text.ToList(text), delreplacewith),  
        
        //convert the list back to text, all delimiters now replaced with blank space
        backtotext =  Text.Combine(delreplaced),

        //split the text by blank space
        tolist = Text.Split(backtotext, " "),

        //check if a value in the list is a number and then select the first number
        number = try List.Select(List.Transform(tolist, each try Number.FromText(_) otherwise null), each _ <> null){0} otherwise null,

        //if does not contain a number, get the second text string  - index position 1
        notnumber = if number = null then tolist{1} else number

in notnumber 

in func

 










Did I answer your question? Mark my post as a solution!


Proud to be a Super User!









"Tell me and I’ll forget; show me and I may remember; involve me and I’ll understand."
Need Power BI consultation, get in touch with me on LinkedIn or hire me on UpWork.
Learn with me on YouTube @DAXJutsu or follow my page on Facebook @DAXJutsuPBI.

This worked great for the first set of numbers with a result of 38439.

 

38439 females  35683 males   0 new  74559 continuing   13973 pupils receiving take-home rations;  55892 family members of 13973 pupils who received take-home rations (assuming four family members per pupil);  222 teachers who received training;  60 civil servants trained;  3975 PTA and community members trained in child health and nutrition

 

How would I modify the code to get the all the numbers split in separate columns?

Anonymous
Not applicable

Hey @danextian , thanks a lot man. I will give it a shot and revert you back.

Hardik
Continued Contributor
Continued Contributor
Anonymous
Not applicable

@HardikProblem is it's not always no. I have to find. I have to look for String if no number present in it.

Hardik
Continued Contributor
Continued Contributor

https://support.office.com/en-us/article/split-a-column-of-text-power-query-5282d425-6dd0-46ca-95bf-...

you can split the data in columns as the room numbers are the second ones after space.So by splitting them on the basis of spaces,I think you will get the room numbers

Anonymous
Not applicable

@Hardik But there are many where room no. starts in the beginning. Please check the example.

Hi @Anonymous,

 

Are your numbers less than 10 everytime?

 

I called the first colums [column1] and with Power Query I used this formula for creating this column.

 

Please check and come back to us it it does not work:

if Text.PositionOf([Column1],".")=-1 then Text.Middle([Column1],Text.PositionOf([Column1]," "),3) else if Text.PositionOf([Column1],".")=1 then Text.Start([Column1],1) else Text.Middle([Column1],Text.PositionOf([Column1]," "),2)

 

Hope it helps...

 

Ninter

Anonymous
Not applicable

Hi @Interkoubess No, I have some numbers which are more than 10. Does it gonna change in your query ?

Hi @Anonymous,

 

Yeah it won't work. I will figure out how to deal with all kind of numbers.

 

I will let you know if anybody did not answer until then...

 

 

Ninter

Hi @Anonymous,

 

Please try this formula ( I made the assertions about the dot and space) for creating a column in Power Query:

 

if Text.Length([Column1])-Text.Length(Text.Replace([Column1],".",""))=0 then Text.BetweenDelimiters([Column1]," "," ") else if Text.Length([Column1])-Text.Length(Text.Replace([Column1],".",""))=1 then Text.BetweenDelimiters([Column1]," ",".") else Text.Start([Column1],Text.PositionOf([Column1],"."))

The column1 is the column where you are your raw data.

 

Hope it helps.

 

Ninter

 

 

Anonymous
Not applicable

Hi @Interkoubess ,

 

I applied your formula, it is working for most of the cases but sometimes it is not giving me the desired result. Like for example.

 

Desired ResultYour Query Resultlage
1li1.OG li
1re1.OG re
DGmiDG mi
DGreDG re
DGSpitzbodenDG Spitzboden

 

 

Please give my code  a try.

 










Did I answer your question? Mark my post as a solution!


Proud to be a Super User!









"Tell me and I’ll forget; show me and I may remember; involve me and I’ll understand."
Need Power BI consultation, get in touch with me on LinkedIn or hire me on UpWork.
Learn with me on YouTube @DAXJutsu or follow my page on Facebook @DAXJutsuPBI.
Anonymous
Not applicable

Hi @danextian , That's what I am trying right now. Sorry for late response.

Anonymous
Not applicable

Hi @Interkoubess Ninter,

 

thank you for the response. but I am still confused how to use this in power BI desktop.

HI @Anonymous,

 

A

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.