Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
I have a messy column due to free text options that needs cleaning up.
Column example
OLD INPUT | REQUIRED OUTPUT (ID) | REQUIRED OUTPUT (LETTERING) | REQUIRED OUTPUT (%) |
P12345 | P12345 | ||
P87463 E | P87463 | E | |
P87463 D | P87463 D | E | |
P98264 M | P98264 | M | |
P87463 40% E | P87463 | E | 40% |
P87463 45% E | P87463 | E | 45% |
P87463 E 80% | P87463 | E | 80% |
P87463 E 80.7% | P87463 | E | 80.7% |
05435 E | 05435 | E | |
05435 D | 05435 | D | |
05435 E 20% | 05435 | E | 20% |
05435 D 40% | 05435 | D | 40% |
05435 80% E | 05435 | E | 80% |
05435 76% D | 05435 | D | 76% |
00001 24.7% E | 00001 | E | 24.7% |
00001 E 30% | 00001 | E | 30% |
00001 M | 00001 | M |
Ideal output can be seen in the 3 columns to the right.
Rules
ID's broken out into seperate column (examples, P12345, 00001)
Single lettering broken out into seperate column (E, M, D)
Percentage broken out into seperate column
Any ideas how this can be achieved within Power Query?
Solved! Go to Solution.
Hi @Serdet ,
if format is fixed as you show, this this can be done as below:
1. Split with space delimiter.
2. Add conditional column with checking the middle column generated in above for containing % and if now % symbol, then its lettering column.
3, Add one more conditional column with checking midle column for containing % and state reverse condition of 2 above, this is % output.
Done.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZBNCsQgDIWvEgruymA1/sxel0L3pfe/xhgjU6MtWJ7vfcbE69rOw1h02/4IaOveaxYDeguZUpZV5TlOT5xk/o3GIxQ61XVVZamPWr3cQa6gnKD+mFOiV4harRSZE/UJrxzZRGqH1rULWYnJ2EpjmKYwg2mNPKfpZ/RYPdHocxEUSOxvI5qIAglera1Uk5H6HWCwjsV12r7XafaAZbDcD+06Y/VIlLFE4ZHvHw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"OLD INPUT" = _t, #"REQUIRED OUTPUT (ID)" = _t, #"REQUIRED OUTPUT (LETTERING)" = _t, #"REQUIRED OUTPUT (%)" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"OLD INPUT", type text}, {"REQUIRED OUTPUT (ID)", type text}, {"REQUIRED OUTPUT (LETTERING)", type text}, {"REQUIRED OUTPUT (%)", Percentage.Type}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"REQUIRED OUTPUT (ID)", "REQUIRED OUTPUT (LETTERING)", "REQUIRED OUTPUT (%)"}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns", "OLD INPUT", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"OLD INPUT.1", "OLD INPUT.2", "OLD INPUT.3"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"OLD INPUT.1", type text}, {"OLD INPUT.2", type text}, {"OLD INPUT.3", type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type1",{{"OLD INPUT.1", "ID"}}),
#"Replaced Value" = Table.ReplaceValue(#"Renamed Columns",null,"",Replacer.ReplaceValue,{"OLD INPUT.2"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",null,"",Replacer.ReplaceValue,{"OLD INPUT.3"}),
#"Added Conditional Column" = Table.AddColumn(#"Replaced Value1", "Lettering", each if Text.Contains([OLD INPUT.2], "%") then [OLD INPUT.3] else [OLD INPUT.2]),
#"Added Conditional Column1" = Table.AddColumn(#"Added Conditional Column", "Output%", each if Text.Contains([OLD INPUT.2], "%") then [OLD INPUT.2] else [OLD INPUT.3]),
#"Changed Type2" = Table.TransformColumnTypes(#"Added Conditional Column1",{{"Lettering", type text}, {"Output%", Percentage.Type}}),
#"Removed Columns1" = Table.RemoveColumns(#"Changed Type2",{"OLD INPUT.2", "OLD INPUT.3"})
in
#"Removed Columns1"
Hope it helps.
Hi @Serdet ,
if format is fixed as you show, this this can be done as below:
1. Split with space delimiter.
2. Add conditional column with checking the middle column generated in above for containing % and if now % symbol, then its lettering column.
3, Add one more conditional column with checking midle column for containing % and state reverse condition of 2 above, this is % output.
Done.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZBNCsQgDIWvEgruymA1/sxel0L3pfe/xhgjU6MtWJ7vfcbE69rOw1h02/4IaOveaxYDeguZUpZV5TlOT5xk/o3GIxQ61XVVZamPWr3cQa6gnKD+mFOiV4harRSZE/UJrxzZRGqH1rULWYnJ2EpjmKYwg2mNPKfpZ/RYPdHocxEUSOxvI5qIAglera1Uk5H6HWCwjsV12r7XafaAZbDcD+06Y/VIlLFE4ZHvHw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"OLD INPUT" = _t, #"REQUIRED OUTPUT (ID)" = _t, #"REQUIRED OUTPUT (LETTERING)" = _t, #"REQUIRED OUTPUT (%)" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"OLD INPUT", type text}, {"REQUIRED OUTPUT (ID)", type text}, {"REQUIRED OUTPUT (LETTERING)", type text}, {"REQUIRED OUTPUT (%)", Percentage.Type}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"REQUIRED OUTPUT (ID)", "REQUIRED OUTPUT (LETTERING)", "REQUIRED OUTPUT (%)"}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns", "OLD INPUT", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"OLD INPUT.1", "OLD INPUT.2", "OLD INPUT.3"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"OLD INPUT.1", type text}, {"OLD INPUT.2", type text}, {"OLD INPUT.3", type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type1",{{"OLD INPUT.1", "ID"}}),
#"Replaced Value" = Table.ReplaceValue(#"Renamed Columns",null,"",Replacer.ReplaceValue,{"OLD INPUT.2"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",null,"",Replacer.ReplaceValue,{"OLD INPUT.3"}),
#"Added Conditional Column" = Table.AddColumn(#"Replaced Value1", "Lettering", each if Text.Contains([OLD INPUT.2], "%") then [OLD INPUT.3] else [OLD INPUT.2]),
#"Added Conditional Column1" = Table.AddColumn(#"Added Conditional Column", "Output%", each if Text.Contains([OLD INPUT.2], "%") then [OLD INPUT.2] else [OLD INPUT.3]),
#"Changed Type2" = Table.TransformColumnTypes(#"Added Conditional Column1",{{"Lettering", type text}, {"Output%", Percentage.Type}}),
#"Removed Columns1" = Table.RemoveColumns(#"Changed Type2",{"OLD INPUT.2", "OLD INPUT.3"})
in
#"Removed Columns1"
Hope it helps.
https://filebin.net/qmjy2yonk4acwz9c
Check this one
Did I answer your question? If so, please mark my post as a solution!
Proud to be a Super User!
What would be the expected output?
Did I answer your question? If so, please mark my post as a solution!
Proud to be a Super User!
Hi,
The output can be seen in the table shown above. First column (original messy input), next three columns are the desired output.
https://docs.microsoft.com/en-us/power-query/split-columns-delimiter
Read this article
Did I answer your question? If so, please mark my post as a solution!
Proud to be a Super User!
Thanks, this is great but doesnt solve that the original input has different orders.
For example an entry could look like any of the following
00001 E 30%
00001 30% E
After splitting columns via the delimited the two secondary columns contain [Letters] & [Percentages].
I need one column for ID, one for Lettering, one for %.
Okay let my try then i upload the file
Did I answer your question? If so, please mark my post as a solution!
Proud to be a Super User!
User | Count |
---|---|
97 | |
85 | |
77 | |
66 | |
62 |
User | Count |
---|---|
110 | |
96 | |
95 | |
64 | |
57 |