Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hi, looking for assistance with manipulating columns in Power Query.
I need to split a column based on a Comma delimiter, however I need to only split every second one. The data is formatted Surname, Initials with a Comma delimiter between people. There are multiple people in each row (can be between 1 and 1,000 values in each row). I am not looking to split into columns as part of this process.
I want to extract into a list, but have each Surname, Initials combination, not Surname then Initials on the next row.
Example Data
Person |
Chapman, D.A., Baker, R., Thomas, B.L. |
Davis, Q., Grant, M.R. |
Desired Result
Person |
Chapman, D.A. |
Baker, R. |
Thomas, B.L. |
Grant, M.R. |
Davis, Q. |
Have been trying the code from here: https://community.powerbi.com/t5/Desktop/Text-Replace-specific-character-only-if-it-occurs-as-the-la...
Table.TransformColumns(#"Changed Type",{{"Column1", each Text.Split(_," ")}})
Hope that explains what I am after.
Cheers
Phil
Solved! Go to Solution.
Hi @PhilC
try split by ".,"
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs5ILMhNzNNRcNFz1NNRcErMTi3SUQgCMkMy8nMTi4FCej56SrE60UouiWWZQH4gUM69KDGvREfBVy8IKBULAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Person = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Person", type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Person", Splitter.SplitTextByDelimiter(".,", QuoteStyle.Csv), {"Person.1", "Person.2", "Person.3"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Person.1", type text}, {"Person.2", type text}, {"Person.3", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type1", "Index", 0, 1),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
#"Trimmed Text" = Table.TransformColumns(#"Unpivoted Other Columns",{{"Value", Text.Trim, type text}}),
#"Added Custom" = Table.AddColumn(#"Trimmed Text", "Custom", each [Value]&"."),
#"Replaced Value" = Table.ReplaceValue(#"Added Custom","..",".",Replacer.ReplaceText,{"Custom"}),
#"Removed Other Columns" = Table.SelectColumns(#"Replaced Value",{"Custom"})
in
#"Removed Other Columns"
do not hesitate to give a kudo to useful posts and mark solutions as solution
Linkedin
Hello @PhilC
had created once a nice function that can be used here. Soltuion by @az38 is nice, but would work only in for this 3 cells. This function can be applied dynamically
(tSplitText as text,
tDelimiter as text,
nOccurance as nullable number) as list =>
let
nOccuranceIntern = if nOccurance = null then 2 else nOccurance,
Source = "Chapman, D.A., Baker, R., Thomas, B.L.",
ListFromText = Text.ToList(tSplitText),
CreateListOfDelimiter = List.Accumulate
(
ListFromText,
[CurrentRow = -1, DelimiterFound={}],
(state, current)=>
if current = tDelimiter then
Record.TransformFields(state, {{"DelimiterFound", each _ & {state[CurrentRow]}},{"CurrentRow", each _ +1 }})
else
Record.TransformFields(state, {{"CurrentRow", each _ +1 }})
)[DelimiterFound],
ListAlternate = List.Alternate(CreateListOfDelimiter,nOccuranceIntern-1,nOccuranceIntern-1),
ChangeTextList = List.Accumulate
(
ListAlternate,
ListFromText,
(state, current)=>
List.ReplaceRange(state,current+1,1,{"%%%"})
),
SplitTextByNewDelimiter = Text.Split
(
List.Accumulate
(
ChangeTextList,
"",
(state, current)=>
state & current
),
"%%%"
)
in
SplitTextByNewDelimiter
If this post helps or solves your problem, please mark it as solution.
Kudos are nice to - thanks
Have fun
Jimmy
Hi @PhilC
a third option for you to choose.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs5ILMhNzNNRcNFz1NNRcErMTi3SUQgCMkMy8nMTi4FCej56SrE60UouiWWZQH4gUM69KDGvREfBVy8IKBULAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Person = _t]),
#"Replaced Value" = Table.ReplaceValue(Source,".,",".#",Replacer.ReplaceText,{"Person"}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value", {{"Person", Splitter.SplitTextByDelimiter("#", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Person"),
#"Trimmed Text" = Table.TransformColumns(#"Split Column by Delimiter",{{"Person", Text.Trim, type text}})
in
#"Trimmed Text"
Hi @PhilC
a third option for you to choose.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs5ILMhNzNNRcNFz1NNRcErMTi3SUQgCMkMy8nMTi4FCej56SrE60UouiWWZQH4gUM69KDGvREfBVy8IKBULAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Person = _t]),
#"Replaced Value" = Table.ReplaceValue(Source,".,",".#",Replacer.ReplaceText,{"Person"}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value", {{"Person", Splitter.SplitTextByDelimiter("#", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Person"),
#"Trimmed Text" = Table.TransformColumns(#"Split Column by Delimiter",{{"Person", Text.Trim, type text}})
in
#"Trimmed Text"
Hi @Mariusz , that is a good approach, keeps the "." portion of the delimiter, and results in the List that I am after.
Thanks.
Hello @PhilC
had created once a nice function that can be used here. Soltuion by @az38 is nice, but would work only in for this 3 cells. This function can be applied dynamically
(tSplitText as text,
tDelimiter as text,
nOccurance as nullable number) as list =>
let
nOccuranceIntern = if nOccurance = null then 2 else nOccurance,
Source = "Chapman, D.A., Baker, R., Thomas, B.L.",
ListFromText = Text.ToList(tSplitText),
CreateListOfDelimiter = List.Accumulate
(
ListFromText,
[CurrentRow = -1, DelimiterFound={}],
(state, current)=>
if current = tDelimiter then
Record.TransformFields(state, {{"DelimiterFound", each _ & {state[CurrentRow]}},{"CurrentRow", each _ +1 }})
else
Record.TransformFields(state, {{"CurrentRow", each _ +1 }})
)[DelimiterFound],
ListAlternate = List.Alternate(CreateListOfDelimiter,nOccuranceIntern-1,nOccuranceIntern-1),
ChangeTextList = List.Accumulate
(
ListAlternate,
ListFromText,
(state, current)=>
List.ReplaceRange(state,current+1,1,{"%%%"})
),
SplitTextByNewDelimiter = Text.Split
(
List.Accumulate
(
ChangeTextList,
"",
(state, current)=>
state & current
),
"%%%"
)
in
SplitTextByNewDelimiter
If this post helps or solves your problem, please mark it as solution.
Kudos are nice to - thanks
Have fun
Jimmy
Hi @Jimmy801 above function is excellent and I have a similar problem where I want to get result in the group of 10 values ( trying to pass 10 containers to the string including '&' symbol except in the 10th posiiton which is api call to get the status) and then split into rows until the last. Could you please advise where do i need to modify value in this function ? Hi Team above function is excellent and I have a similar problem where I want to get result in the group of 10 values and then split into rows. Could you please advise where do i need to modify value in this function ?
Hi @Jimmy801 , the function works, thanks. I think splitting by ".," / ".#" is a simpler solution so will go with that one.
Hi @PhilC
try split by ".,"
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs5ILMhNzNNRcNFz1NNRcErMTi3SUQgCMkMy8nMTi4FCej56SrE60UouiWWZQH4gUM69KDGvREfBVy8IKBULAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Person = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Person", type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Person", Splitter.SplitTextByDelimiter(".,", QuoteStyle.Csv), {"Person.1", "Person.2", "Person.3"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Person.1", type text}, {"Person.2", type text}, {"Person.3", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type1", "Index", 0, 1),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
#"Trimmed Text" = Table.TransformColumns(#"Unpivoted Other Columns",{{"Value", Text.Trim, type text}}),
#"Added Custom" = Table.AddColumn(#"Trimmed Text", "Custom", each [Value]&"."),
#"Replaced Value" = Table.ReplaceValue(#"Added Custom","..",".",Replacer.ReplaceText,{"Custom"}),
#"Removed Other Columns" = Table.SelectColumns(#"Replaced Value",{"Custom"})
in
#"Removed Other Columns"
do not hesitate to give a kudo to useful posts and mark solutions as solution
Linkedin
Hi @az38
Thanks for that, the suggestion for using ".," is what I needed, no idea why I did not see that. Thank you.
Edit: just realised that it means the "." will be dropped when splitting, which is not ideal.
Regarding the code, I need an approach that does not split to columns due to the size of the contents (up to 1,000) - this was mentioned in the post, but did not mention that I have three other fields that also need to be expanded, so need to use Lists in the columns.
Cheers, Phil
Hello @PhilC
that is the downside using this "simply" splitting. You have to remaintain data afterwards. And if you applying the logic to a larger dataset, the chances are quite big, this "simplier" solutions won't work (more than 3 parts, ".," not present, but only ","
So "simply" use my function and you are at the save side 🙂
If this post helps or solves your problem, please mark it as solution.
Kudos are nice to - thanks
Have fun
Jimmy
Hi @Jimmy801 ,
I'd love to better understand your thoughts to ensure am using best options.
What did you mean by?
You have to remaintain data afterwards.
How does your approach work better for larger datasets compared to using the Splitter function? I guess an explanation of the steps in the function would help so I can follow what is being done and understand how it is more robust as an approach.
Cheers
Phil
Hello @PhilC
you have shown us only a dataset of 3 rows. So my thoughts on this are as follows
- there might be rows with more than 3 names in it
- there might be rows where you can't apply the ".," but only the real delimiter that is a "," and not the ".,"
- there might be the necessity to get one object to be handled afterwards (nested list) and not new columns based on the content (you would not be able to handle this in a dynamically way afterwards)
So whenever one of this occurs, the other solutions will fail. Maybe also my solution, because i didn't test it on a dataset like this. But I foreseen it. So applying my function, the above mentioned topics should be handled
All the best
Jimmy