Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi, I am trying to clean a manual input file.
So, I have a cell of one column with content like this: abc123abc456abc789
(1) and I want to replace the "abc" to become like this: abc123, abc456, abc789 (so excluding the first "abc" to be replaced).
(2) the next step is I want to split the content to be like this:
column 1 | > | column 123 | column 456 | column 789 |
abc123, abc456, abc789 | abc123 | abc456 | abc789 | |
abc456, abc789 | abc456 | abc789 | ||
abc123, abc789 | abc123 | abc789 |
Please be advised that I would like to have each of the steps get a solution.
Hope I can get the answer to it here.
Thanks all!
Solved! Go to Solution.
Solve your two problems at one go
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSkxKNjQyBpImpmZA0tzCUilWByyMIQBRBxaIBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Split Column" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByCharacterTransition({"0".."9"}, (c) => not List.Contains({"0".."9"}, c))),
#"Removed Nulls" = List.Transform(Table.ToRows(#"Split Column"), each Table.FromRecords({List.Accumulate(_, [], (s,c) => if not (c is null) then Record.AddField(s, Text.Select(c, {"0".."9"}), c) else s)})),
#"Combined Tables" = Table.Combine(#"Removed Nulls")
in
#"Combined Tables"
PS: my favorable solution with regular expression
Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension! |
DAX is simple, but NOT EASY! |
Solve your two problems at one go
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSkxKNjQyBpImpmZA0tzCUilWByyMIQBRBxaIBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Split Column" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByCharacterTransition({"0".."9"}, (c) => not List.Contains({"0".."9"}, c))),
#"Removed Nulls" = List.Transform(Table.ToRows(#"Split Column"), each Table.FromRecords({List.Accumulate(_, [], (s,c) => if not (c is null) then Record.AddField(s, Text.Select(c, {"0".."9"}), c) else s)})),
#"Combined Tables" = Table.Combine(#"Removed Nulls")
in
#"Combined Tables"
PS: my favorable solution with regular expression
Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension! |
DAX is simple, but NOT EASY! |
Here's one way to do it in the query editor. To see how it works, just create a blank query, open the Advanced Editor and replace the text there with the M code below.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSkxK1gNiQyNjEGViagaizC0slWJ14JLYRaFawKKxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Extracted Text After Delimiter" = Table.TransformColumns(Source, {{"Column1", each Text.AfterDelimiter(_, "abc."), type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Text After Delimiter", "Column1", Splitter.SplitTextByDelimiter(".", QuoteStyle.Csv), {"Column1", "Column2", "Column3"}),
#"Added Index" = Table.AddIndexColumn(#"Split Column by Delimiter", "Index", 1, 1, Int64.Type),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
#"Added Custom" = Table.AddColumn(#"Unpivoted Other Columns", "Custom", each "Column" & Text.AfterDelimiter([Value], "abc")),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Attribute"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Custom]), "Custom", "Value"),
#"Removed Columns1" = Table.RemoveColumns(#"Pivoted Column",{"Index"})
in
#"Removed Columns1"
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Hi Pat,
Thanks for answering. It seems to answer my second problem.
But how to solve my first problem? Actually, my initial string doesn't have a dot separating each "abc"s. I still need to replace the "abc123abc456abc789" with "abc123, abc456, abc789" (replace "abc" to ", abc" but starting on the second abc/excluding the first abc). that way so that I can split using ", " delimiter to each individual column in an efficient manner.
also, do you have any solution to make it dynamic in case in the future I get more and more "abc" for the column split? because in my example, we only have 3 "abc"s and when I reproduce using 4 "abc"s I still have to open Settings in Split Column step to correctly split the 4th "abc".
Hope my questions and description are clear enough.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
99 | |
50 | |
19 | |
12 | |
11 |