Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Serdet
Post Patron
Post Patron

Clean Up Messy Column

I have a messy column due to free text options that needs cleaning up.

 

Column example

 

OLD INPUTREQUIRED OUTPUT (ID)REQUIRED OUTPUT (LETTERING)REQUIRED OUTPUT (%)
P12345P12345  
P87463 EP87463 E 
P87463 DP87463 DE 
P98264 M P98264  
P87463 40% EP87463 E40% 
P87463 45% EP87463  E45%
P87463 E 80%P87463  E80%
P87463 E 80.7%P87463  E80.7%
05435 E05435 E 
05435 D05435 D 
05435 E 20%05435 20%
05435 D 40%05435 D40%
05435 80% E05435 E80%
05435 76% D05435 D76%
00001 24.7% E00001 E24.7%
00001 E 30%00001E30%
00001 M00001 M 

 

Ideal output can be seen in the 3 columns to the right. 

 

Rules

ID's broken out into seperate column  (examples, P12345, 00001) 

Single lettering broken out into seperate column (E, M, D)

Percentage broken out into seperate column 

 

Any ideas how this can be achieved within Power Query?

 

1 ACCEPTED SOLUTION
mahenkj2
Solution Sage
Solution Sage

Hi @Serdet ,

if format is fixed as you show, this this can be done as below:

 

1. Split with space delimiter.

2. Add conditional column with checking the middle column generated in above for containing % and if now % symbol, then its lettering column.

3, Add one more conditional column with checking midle column for containing % and state reverse condition of 2 above, this is % output.

 

Done.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZBNCsQgDIWvEgruymA1/sxel0L3pfe/xhgjU6MtWJ7vfcbE69rOw1h02/4IaOveaxYDeguZUpZV5TlOT5xk/o3GIxQ61XVVZamPWr3cQa6gnKD+mFOiV4harRSZE/UJrxzZRGqH1rULWYnJ2EpjmKYwg2mNPKfpZ/RYPdHocxEUSOxvI5qIAglera1Uk5H6HWCwjsV12r7XafaAZbDcD+06Y/VIlLFE4ZHvHw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"OLD INPUT" = _t, #"REQUIRED OUTPUT (ID)" = _t, #"REQUIRED OUTPUT (LETTERING)" = _t, #"REQUIRED OUTPUT (%)" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"OLD INPUT", type text}, {"REQUIRED OUTPUT (ID)", type text}, {"REQUIRED OUTPUT (LETTERING)", type text}, {"REQUIRED OUTPUT (%)", Percentage.Type}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"REQUIRED OUTPUT (ID)", "REQUIRED OUTPUT (LETTERING)", "REQUIRED OUTPUT (%)"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns", "OLD INPUT", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"OLD INPUT.1", "OLD INPUT.2", "OLD INPUT.3"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"OLD INPUT.1", type text}, {"OLD INPUT.2", type text}, {"OLD INPUT.3", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type1",{{"OLD INPUT.1", "ID"}}),
    #"Replaced Value" = Table.ReplaceValue(#"Renamed Columns",null,"",Replacer.ReplaceValue,{"OLD INPUT.2"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",null,"",Replacer.ReplaceValue,{"OLD INPUT.3"}),
    #"Added Conditional Column" = Table.AddColumn(#"Replaced Value1", "Lettering", each if Text.Contains([OLD INPUT.2], "%") then [OLD INPUT.3] else [OLD INPUT.2]),
    #"Added Conditional Column1" = Table.AddColumn(#"Added Conditional Column", "Output%", each if Text.Contains([OLD INPUT.2], "%") then [OLD INPUT.2] else [OLD INPUT.3]),
    #"Changed Type2" = Table.TransformColumnTypes(#"Added Conditional Column1",{{"Lettering", type text}, {"Output%", Percentage.Type}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Changed Type2",{"OLD INPUT.2", "OLD INPUT.3"})
in
    #"Removed Columns1"

 

mahenkj2_0-1654088330918.png

 

Hope it helps.

View solution in original post

7 REPLIES 7
mahenkj2
Solution Sage
Solution Sage

Hi @Serdet ,

if format is fixed as you show, this this can be done as below:

 

1. Split with space delimiter.

2. Add conditional column with checking the middle column generated in above for containing % and if now % symbol, then its lettering column.

3, Add one more conditional column with checking midle column for containing % and state reverse condition of 2 above, this is % output.

 

Done.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZBNCsQgDIWvEgruymA1/sxel0L3pfe/xhgjU6MtWJ7vfcbE69rOw1h02/4IaOveaxYDeguZUpZV5TlOT5xk/o3GIxQ61XVVZamPWr3cQa6gnKD+mFOiV4harRSZE/UJrxzZRGqH1rULWYnJ2EpjmKYwg2mNPKfpZ/RYPdHocxEUSOxvI5qIAglera1Uk5H6HWCwjsV12r7XafaAZbDcD+06Y/VIlLFE4ZHvHw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"OLD INPUT" = _t, #"REQUIRED OUTPUT (ID)" = _t, #"REQUIRED OUTPUT (LETTERING)" = _t, #"REQUIRED OUTPUT (%)" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"OLD INPUT", type text}, {"REQUIRED OUTPUT (ID)", type text}, {"REQUIRED OUTPUT (LETTERING)", type text}, {"REQUIRED OUTPUT (%)", Percentage.Type}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"REQUIRED OUTPUT (ID)", "REQUIRED OUTPUT (LETTERING)", "REQUIRED OUTPUT (%)"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns", "OLD INPUT", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"OLD INPUT.1", "OLD INPUT.2", "OLD INPUT.3"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"OLD INPUT.1", type text}, {"OLD INPUT.2", type text}, {"OLD INPUT.3", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type1",{{"OLD INPUT.1", "ID"}}),
    #"Replaced Value" = Table.ReplaceValue(#"Renamed Columns",null,"",Replacer.ReplaceValue,{"OLD INPUT.2"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",null,"",Replacer.ReplaceValue,{"OLD INPUT.3"}),
    #"Added Conditional Column" = Table.AddColumn(#"Replaced Value1", "Lettering", each if Text.Contains([OLD INPUT.2], "%") then [OLD INPUT.3] else [OLD INPUT.2]),
    #"Added Conditional Column1" = Table.AddColumn(#"Added Conditional Column", "Output%", each if Text.Contains([OLD INPUT.2], "%") then [OLD INPUT.2] else [OLD INPUT.3]),
    #"Changed Type2" = Table.TransformColumnTypes(#"Added Conditional Column1",{{"Lettering", type text}, {"Output%", Percentage.Type}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Changed Type2",{"OLD INPUT.2", "OLD INPUT.3"})
in
    #"Removed Columns1"

 

mahenkj2_0-1654088330918.png

 

Hope it helps.

mh2587
Super User
Super User

https://filebin.net/qmjy2yonk4acwz9c

Check this one


Did I answer your question? If so, please mark my post as a solution!


Proud to be a Super User!




LinkedIn Icon
Muhammad Hasnain



mh2587
Super User
Super User

What would be the expected output?

 


Did I answer your question? If so, please mark my post as a solution!


Proud to be a Super User!




LinkedIn Icon
Muhammad Hasnain



Hi,

 

The output can be seen in the table shown above. First column (original messy input), next three columns are the desired output.

https://docs.microsoft.com/en-us/power-query/split-columns-delimiter

Read this article


Did I answer your question? If so, please mark my post as a solution!


Proud to be a Super User!




LinkedIn Icon
Muhammad Hasnain



Thanks, this is great but doesnt solve that the original input has different orders.

 

For example an entry could look like any of the following

 

00001 E 30%

00001 30% E

 

After splitting columns via the delimited the two secondary columns contain [Letters] & [Percentages].

 

I need one column for ID, one for Lettering, one for %.

Okay let my try then i upload the file


Did I answer your question? If so, please mark my post as a solution!


Proud to be a Super User!




LinkedIn Icon
Muhammad Hasnain



Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.