Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
PhilC
Resolver I
Resolver I

Split text based on every second delimiter

Hi, looking for assistance with manipulating columns in Power Query.

 

I need to split a column based on a Comma delimiter, however I need to only split every second one.  The data is formatted Surname, Initials with a Comma delimiter between people.  There are multiple people in each row (can be between 1 and 1,000 values in each row).  I am not looking to split into columns as part of this process.

 

I want to extract into a list, but have each Surname, Initials combination, not Surname then Initials on the next row.

 

Example Data

Person
Chapman, D.A., Baker, R., Thomas, B.L.
Davis, Q., Grant, M.R.

 

Desired Result

Person
Chapman, D.A.
Baker, R.
Thomas, B.L.
Grant, M.R.
Davis, Q.

 

Have been trying the code from here:  https://community.powerbi.com/t5/Desktop/Text-Replace-specific-character-only-if-it-occurs-as-the-la...

Table.TransformColumns(#"Changed Type",{{"Column1", each Text.Split(_," ")}})

 

Hope that explains what I am after.

 

Cheers

Phil

 

3 ACCEPTED SOLUTIONS
az38
Community Champion
Community Champion

Hi @PhilC 

try split by ".,"

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs5ILMhNzNNRcNFz1NNRcErMTi3SUQgCMkMy8nMTi4FCej56SrE60UouiWWZQH4gUM69KDGvREfBVy8IKBULAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Person = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Person", type text}}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Person", Splitter.SplitTextByDelimiter(".,", QuoteStyle.Csv), {"Person.1", "Person.2", "Person.3"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Person.1", type text}, {"Person.2", type text}, {"Person.3", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type1", "Index", 0, 1),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
    #"Trimmed Text" = Table.TransformColumns(#"Unpivoted Other Columns",{{"Value", Text.Trim, type text}}),
    #"Added Custom" = Table.AddColumn(#"Trimmed Text", "Custom", each [Value]&"."),
    #"Replaced Value" = Table.ReplaceValue(#"Added Custom","..",".",Replacer.ReplaceText,{"Custom"}),
    #"Removed Other Columns" = Table.SelectColumns(#"Replaced Value",{"Custom"})
in
    #"Removed Other Columns"

do not hesitate to give a kudo to useful posts and mark solutions as solution
Linkedin


do not hesitate to give a kudo to useful posts and mark solutions as solution
LinkedIn

View solution in original post

Jimmy801
Community Champion
Community Champion

Hello @PhilC 

 

had created once a nice function that can be used here. Soltuion by @az38  is nice, but would work only in for this 3 cells. This function can be applied dynamically

(tSplitText as text, 
tDelimiter as text, 
nOccurance as nullable number) as list =>

let
    nOccuranceIntern = if nOccurance = null then 2 else nOccurance,
    Source = "Chapman, D.A., Baker, R., Thomas, B.L.",
    ListFromText = Text.ToList(tSplitText),
    CreateListOfDelimiter = List.Accumulate
        (
            ListFromText,
            [CurrentRow = -1, DelimiterFound={}],
            (state, current)=>
                if current = tDelimiter then
                        Record.TransformFields(state, {{"DelimiterFound", each _ & {state[CurrentRow]}},{"CurrentRow", each _ +1 }})
                        else
                        Record.TransformFields(state, {{"CurrentRow", each _ +1 }})
        )[DelimiterFound],
    ListAlternate = List.Alternate(CreateListOfDelimiter,nOccuranceIntern-1,nOccuranceIntern-1),
    ChangeTextList = List.Accumulate
        (
            ListAlternate,
            ListFromText,
            (state, current)=>
                List.ReplaceRange(state,current+1,1,{"%%%"})
        ),
    SplitTextByNewDelimiter = Text.Split
        (
            List.Accumulate
                (
                    ChangeTextList,
                    "",
                    (state, current)=> 
                        state & current
                ),
            "%%%"
         )


in
    SplitTextByNewDelimiter

 

If this post helps or solves your problem, please mark it as solution.
Kudos are nice to - thanks
Have fun

Jimmy

View solution in original post

Mariusz
Community Champion
Community Champion

Hi @PhilC 

 

a third option for you to choose.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs5ILMhNzNNRcNFz1NNRcErMTi3SUQgCMkMy8nMTi4FCej56SrE60UouiWWZQH4gUM69KDGvREfBVy8IKBULAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Person = _t]),
    #"Replaced Value" = Table.ReplaceValue(Source,".,",".#",Replacer.ReplaceText,{"Person"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value", {{"Person", Splitter.SplitTextByDelimiter("#", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Person"),
    #"Trimmed Text" = Table.TransformColumns(#"Split Column by Delimiter",{{"Person", Text.Trim, type text}})
in
    #"Trimmed Text"

 

Best Regards,
Mariusz

If this post helps, then please consider Accepting it as the solution.

 

View solution in original post

10 REPLIES 10
Mariusz
Community Champion
Community Champion

Hi @PhilC 

 

a third option for you to choose.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs5ILMhNzNNRcNFz1NNRcErMTi3SUQgCMkMy8nMTi4FCej56SrE60UouiWWZQH4gUM69KDGvREfBVy8IKBULAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Person = _t]),
    #"Replaced Value" = Table.ReplaceValue(Source,".,",".#",Replacer.ReplaceText,{"Person"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value", {{"Person", Splitter.SplitTextByDelimiter("#", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Person"),
    #"Trimmed Text" = Table.TransformColumns(#"Split Column by Delimiter",{{"Person", Text.Trim, type text}})
in
    #"Trimmed Text"

 

Best Regards,
Mariusz

If this post helps, then please consider Accepting it as the solution.

 

Hi @Mariusz , that is a good approach, keeps the "." portion of the delimiter, and results in the List that I am after.

 

Thanks.

Jimmy801
Community Champion
Community Champion

Hello @PhilC 

 

had created once a nice function that can be used here. Soltuion by @az38  is nice, but would work only in for this 3 cells. This function can be applied dynamically

(tSplitText as text, 
tDelimiter as text, 
nOccurance as nullable number) as list =>

let
    nOccuranceIntern = if nOccurance = null then 2 else nOccurance,
    Source = "Chapman, D.A., Baker, R., Thomas, B.L.",
    ListFromText = Text.ToList(tSplitText),
    CreateListOfDelimiter = List.Accumulate
        (
            ListFromText,
            [CurrentRow = -1, DelimiterFound={}],
            (state, current)=>
                if current = tDelimiter then
                        Record.TransformFields(state, {{"DelimiterFound", each _ & {state[CurrentRow]}},{"CurrentRow", each _ +1 }})
                        else
                        Record.TransformFields(state, {{"CurrentRow", each _ +1 }})
        )[DelimiterFound],
    ListAlternate = List.Alternate(CreateListOfDelimiter,nOccuranceIntern-1,nOccuranceIntern-1),
    ChangeTextList = List.Accumulate
        (
            ListAlternate,
            ListFromText,
            (state, current)=>
                List.ReplaceRange(state,current+1,1,{"%%%"})
        ),
    SplitTextByNewDelimiter = Text.Split
        (
            List.Accumulate
                (
                    ChangeTextList,
                    "",
                    (state, current)=> 
                        state & current
                ),
            "%%%"
         )


in
    SplitTextByNewDelimiter

 

If this post helps or solves your problem, please mark it as solution.
Kudos are nice to - thanks
Have fun

Jimmy

Hi @Jimmy801  above function is excellent and I have a similar problem where I want to get result in the group of 10 values ( trying to pass 10 containers to the string including '&' symbol except in the 10th posiiton which is api call to get the status) and then split into rows until the last. Could you please advise where do i need to modify value in this function ? Hi Team above function is excellent and I have a similar problem where I want to get result in the group of 10 values and then split into rows. Could you please advise where do i need to modify value in this function ? string.PNG

Hi @Jimmy801 , the function works, thanks.  I think splitting by ".," / ".#" is a simpler solution so will go with that one.

az38
Community Champion
Community Champion

Hi @PhilC 

try split by ".,"

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs5ILMhNzNNRcNFz1NNRcErMTi3SUQgCMkMy8nMTi4FCej56SrE60UouiWWZQH4gUM69KDGvREfBVy8IKBULAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Person = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Person", type text}}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Person", Splitter.SplitTextByDelimiter(".,", QuoteStyle.Csv), {"Person.1", "Person.2", "Person.3"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Person.1", type text}, {"Person.2", type text}, {"Person.3", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type1", "Index", 0, 1),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
    #"Trimmed Text" = Table.TransformColumns(#"Unpivoted Other Columns",{{"Value", Text.Trim, type text}}),
    #"Added Custom" = Table.AddColumn(#"Trimmed Text", "Custom", each [Value]&"."),
    #"Replaced Value" = Table.ReplaceValue(#"Added Custom","..",".",Replacer.ReplaceText,{"Custom"}),
    #"Removed Other Columns" = Table.SelectColumns(#"Replaced Value",{"Custom"})
in
    #"Removed Other Columns"

do not hesitate to give a kudo to useful posts and mark solutions as solution
Linkedin


do not hesitate to give a kudo to useful posts and mark solutions as solution
LinkedIn

Hi @az38

 

Thanks for that, the suggestion for using ".," is what I needed, no idea why I did not see that.  Thank you.

 

Edit: just realised that it means the "." will be dropped when splitting, which is not ideal.

 

Regarding the code, I need an approach that does not split to columns due to the size of the contents (up to 1,000) - this was mentioned in the post, but did not mention that I have three other fields that also need to be expanded, so need to use Lists in the columns.

Cheers, Phil

Jimmy801
Community Champion
Community Champion

Hello @PhilC 

 

that is the downside using this "simply" splitting. You have to remaintain data afterwards. And if you applying the logic to a larger dataset, the chances are quite big, this "simplier" solutions won't work (more than 3 parts, ".," not present, but only ","

So "simply" use my function and you are at the save side 🙂

 

If this post helps or solves your problem, please mark it as solution.
Kudos are nice to - thanks
Have fun

Jimmy

Hi @Jimmy801 ,

 

I'd love to better understand your thoughts to ensure am using best options. 

 

What did you mean by?


You have to remaintain data afterwards.

 

How does your approach work better for larger datasets compared to using the Splitter function?  I guess an explanation of the steps in the function would help so I can follow what is being done and understand how it is more robust as an approach.

 

Cheers

Phil

Jimmy801
Community Champion
Community Champion

Hello @PhilC 

 

you have shown us only a dataset of 3 rows. So my thoughts on this are as follows

- there might be rows with more than 3 names in it

- there might be rows where you can't apply the ".," but only the real delimiter that is a "," and not the ".,"

- there might be the necessity to get one object to be handled afterwards (nested list) and not new columns based on the content (you would not be able to handle this in a dynamically way afterwards)

So whenever one of this occurs, the other solutions will fail. Maybe also my solution, because i didn't test it on a dataset like this. But I foreseen it. So applying my function, the above mentioned topics should be handled

 

All the best

 

Jimmy

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors