cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Regular Visitor

How to split a column by words

I have a column like this one:

 

Bricks 20KG Pallet 

Red Wood 5KG Box 4

Water 2L Bottle 6

Sand

Distilled Water 1L Bottle

 

I need to get 3 columns, one with the name of the product, another one with the format and another one with the packaging.

I was trying to use a number as a delimiter, but using that, I was getting "null" when there was no number.

Also, some products have two words as a name, so I cannot use a blank space as delimiter.

If I start from the end, I have the same problem, because some products have two word as packaging.

 

I was trying to do this for a few days but I have no idea how to do it.

Could somebody help me or give me some tips?

Thank you in advance

 

1 ACCEPTED SOLUTION
Super User I
Super User I

Hi @Manumitec ,
This should give you enough to go on.

1.Count the spaces

2.Build the logic based on number of spaces

3.Add conditions to your final 3 columns such as seen in picture below.

4. Build other columns to support your logic.

One thing that wasn't clear, was what a value was...water is the Format 2L or 2L Bottle and came in a package of 6. You would be well served to delineate this on paper first so that you know what columns you have to create, and how you are going to get them.  Also naming convention in the steps, such as I did for the 3 main columns.  At the end, delete all columns but the 3.  Use [space] as else value  so you don't get a null.

A lot of work!


Let me know if you have any questions.

If this solves your issues, please mark it as the solution, so that others can find it easily. Kudos 👍are nice too.
Nathaniel
Split.PNG

 

Split 1.PNG
Paste this into the Advanced Editor of a Blank Query

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcirKTM4uVjAy8HZXCEjMyUktUVCK1YlWAhNBqSkK4fn5KQqmQFmn/AoFE4RceGJJapGCkQ9QvKQkJ1XBDCEVnJiXguC5ZBaXZAINBhoF1mII06IUGwsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"Column1", Text.Trim, type text}}),
    #"Cleaned Text" = Table.TransformColumns(#"Trimmed Text",{{"Column1", Text.Clean, type text}}),
    #"Removed Blank Rows1" = Table.SelectRows(#"Cleaned Text", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
    #"Count Space" = Table.AddColumn(#"Removed Blank Rows1", "Count space", each List.Count(Text.Split([Column1]," "))-1),
    #"Changed Type1" = Table.TransformColumnTypes(#"Count Space",{{"Count space", Int64.Type}}),
    Custom = Table.AddColumn(#"Changed Type1", "Custom", each if [Count space] = 2 then "split by space" else if [Count space] = 0 then "split by space" else if [Count space] = 3 then "if first value is digit split there" else if [Count space] = 4 then "find first digit and split there" else null),
    #"Inserted Text Before Delimiter" = Table.AddColumn(Custom, "Text Before Delimiter", each Text.BeforeDelimiter([Column1], " "), type text),
    #"Inserted Text Before Delimiter1" = Table.AddColumn(#"Inserted Text Before Delimiter", "Text Before Delimiter.1", each Text.BeforeDelimiter([Column1], " ", 1), type text),
    #"Inserted Text Before Delimiter2" = Table.AddColumn(#"Inserted Text Before Delimiter1", "Text Before Delimiter.2", each Text.BeforeDelimiter([Column1], " ", 2), type text),
    #"Inserted Text After Delimiter" = Table.AddColumn(#"Inserted Text Before Delimiter2", "Text After Delimiter", each Text.AfterDelimiter([Column1], " ", 2), type text),
    #"Renamed Columns" = Table.RenameColumns(#"Inserted Text After Delimiter",{{"Text Before Delimiter", "0 or2"}}),
    Name = Table.AddColumn(#"Renamed Columns", "Name", each if [Count space] = 2 then [0 or2] else if [Count space] = 4 then [Text Before Delimiter.1] else " "),
    #"Changed Type2" = Table.TransformColumnTypes(Name,{{"Name", type text}}),
    #"Inserted Text After Delimiter1" = Table.AddColumn(#"Changed Type2", "Text After Delimiter.1", each Text.AfterDelimiter([Column1], " ", {0, RelativePosition.FromEnd}), type text),
    #"Renamed Columns1" = Table.RenameColumns(#"Inserted Text After Delimiter1",{{"Text After Delimiter.1", "After delimiter"}}),
    Package = Table.AddColumn(#"Renamed Columns1", "Packaging", each if [Count space] = 2 then [After delimiter] else if [Count space] = 4 then [Text After Delimiter] else "  "),
    #"Inserted Text Between Delimiters" = Table.AddColumn(Package, "Text Between Delimiters", each Text.BetweenDelimiters([Column1], " ", " "), type text),
    Format = Table.AddColumn(#"Inserted Text Between Delimiters", "Format", each if [Count space] = 2 then [Text Between Delimiters] else " "),
    #"Changed Type3" = Table.TransformColumnTypes(Format,{{"Format", type text}, {"Packaging", type text}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Changed Type3",{"Name", "Format", "Packaging", "Column1", "Count space", "Custom", "0 or2", "Text Before Delimiter.1", "Text Before Delimiter.2", "Text After Delimiter", "After delimiter", "Text Between Delimiters"}),
    #"Renamed Columns2" = Table.RenameColumns(#"Reordered Columns",{{"Column1", "Original"}}),
    #"Changed Type4" = Table.TransformColumnTypes(#"Renamed Columns2",{{"Custom", type text}}),
    #"Renamed Columns3" = Table.RenameColumns(#"Changed Type4",{{"Custom", "Logic"}}),
    #"Inserted Text Between Delimiters1" = Table.AddColumn(#"Renamed Columns3", "Text Between Delimiters.1", each Text.BetweenDelimiters([Original], " ", " ", 1, 0), type text),
    #"Added Conditional Column" = Table.AddColumn(#"Inserted Text Between Delimiters1", "Package", each if [Count space] = 2 then [Text Between Delimiters] else if [Count space] = 4 then [Text Between Delimiters.1] else " "),
    #"Changed Type5" = Table.TransformColumnTypes(#"Added Conditional Column",{{"Package", type text}}),
    #"Reordered Columns1" = Table.ReorderColumns(#"Changed Type5",{"Name", "Format", "Package", "Packaging", "Original", "Count space", "Logic", "0 or2", "Text Before Delimiter.1", "Text Before Delimiter.2", "Text After Delimiter", "After delimiter", "Text Between Delimiters", "Text Between Delimiters.1"}),
    #"Removed Columns" = Table.RemoveColumns(#"Reordered Columns1",{"Format"}),
    #"Renamed Columns4" = Table.RenameColumns(#"Removed Columns",{{"Package", "Format"}})
in
    #"Renamed Columns4"




Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




View solution in original post

7 REPLIES 7
Super User I
Super User I

Hi @Manumitec ,
Let me know if there are any issues, and I will circle back if I have time during the day, or at the end of day.  Hadn't seen the latest addition about the format and packaging.  That would simplify things.  One of the ifs could be if in a list of each packaging types, or {1,2,3,4,5,6,7,8,9,0} then do this.  I tried originally to separate at a digit, but as the col is text, does not work.  

 

Good Luck
Let me know if you have any questions.

If this solves your issues, please mark it as the solution, so that others can find it easily. Kudos 👍are nice too.
Nathaniel





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




It's not exactly exactly what I was looking for but it helped me alot. I got an idea and I will apply what you said, and it will work probably.

Thank you so much!

Super User I
Super User I

Hi @Manumitec ,
This should give you enough to go on.

1.Count the spaces

2.Build the logic based on number of spaces

3.Add conditions to your final 3 columns such as seen in picture below.

4. Build other columns to support your logic.

One thing that wasn't clear, was what a value was...water is the Format 2L or 2L Bottle and came in a package of 6. You would be well served to delineate this on paper first so that you know what columns you have to create, and how you are going to get them.  Also naming convention in the steps, such as I did for the 3 main columns.  At the end, delete all columns but the 3.  Use [space] as else value  so you don't get a null.

A lot of work!


Let me know if you have any questions.

If this solves your issues, please mark it as the solution, so that others can find it easily. Kudos 👍are nice too.
Nathaniel
Split.PNG

 

Split 1.PNG
Paste this into the Advanced Editor of a Blank Query

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcirKTM4uVjAy8HZXCEjMyUktUVCK1YlWAhNBqSkK4fn5KQqmQFmn/AoFE4RceGJJapGCkQ9QvKQkJ1XBDCEVnJiXguC5ZBaXZAINBhoF1mII06IUGwsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"Column1", Text.Trim, type text}}),
    #"Cleaned Text" = Table.TransformColumns(#"Trimmed Text",{{"Column1", Text.Clean, type text}}),
    #"Removed Blank Rows1" = Table.SelectRows(#"Cleaned Text", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
    #"Count Space" = Table.AddColumn(#"Removed Blank Rows1", "Count space", each List.Count(Text.Split([Column1]," "))-1),
    #"Changed Type1" = Table.TransformColumnTypes(#"Count Space",{{"Count space", Int64.Type}}),
    Custom = Table.AddColumn(#"Changed Type1", "Custom", each if [Count space] = 2 then "split by space" else if [Count space] = 0 then "split by space" else if [Count space] = 3 then "if first value is digit split there" else if [Count space] = 4 then "find first digit and split there" else null),
    #"Inserted Text Before Delimiter" = Table.AddColumn(Custom, "Text Before Delimiter", each Text.BeforeDelimiter([Column1], " "), type text),
    #"Inserted Text Before Delimiter1" = Table.AddColumn(#"Inserted Text Before Delimiter", "Text Before Delimiter.1", each Text.BeforeDelimiter([Column1], " ", 1), type text),
    #"Inserted Text Before Delimiter2" = Table.AddColumn(#"Inserted Text Before Delimiter1", "Text Before Delimiter.2", each Text.BeforeDelimiter([Column1], " ", 2), type text),
    #"Inserted Text After Delimiter" = Table.AddColumn(#"Inserted Text Before Delimiter2", "Text After Delimiter", each Text.AfterDelimiter([Column1], " ", 2), type text),
    #"Renamed Columns" = Table.RenameColumns(#"Inserted Text After Delimiter",{{"Text Before Delimiter", "0 or2"}}),
    Name = Table.AddColumn(#"Renamed Columns", "Name", each if [Count space] = 2 then [0 or2] else if [Count space] = 4 then [Text Before Delimiter.1] else " "),
    #"Changed Type2" = Table.TransformColumnTypes(Name,{{"Name", type text}}),
    #"Inserted Text After Delimiter1" = Table.AddColumn(#"Changed Type2", "Text After Delimiter.1", each Text.AfterDelimiter([Column1], " ", {0, RelativePosition.FromEnd}), type text),
    #"Renamed Columns1" = Table.RenameColumns(#"Inserted Text After Delimiter1",{{"Text After Delimiter.1", "After delimiter"}}),
    Package = Table.AddColumn(#"Renamed Columns1", "Packaging", each if [Count space] = 2 then [After delimiter] else if [Count space] = 4 then [Text After Delimiter] else "  "),
    #"Inserted Text Between Delimiters" = Table.AddColumn(Package, "Text Between Delimiters", each Text.BetweenDelimiters([Column1], " ", " "), type text),
    Format = Table.AddColumn(#"Inserted Text Between Delimiters", "Format", each if [Count space] = 2 then [Text Between Delimiters] else " "),
    #"Changed Type3" = Table.TransformColumnTypes(Format,{{"Format", type text}, {"Packaging", type text}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Changed Type3",{"Name", "Format", "Packaging", "Column1", "Count space", "Custom", "0 or2", "Text Before Delimiter.1", "Text Before Delimiter.2", "Text After Delimiter", "After delimiter", "Text Between Delimiters"}),
    #"Renamed Columns2" = Table.RenameColumns(#"Reordered Columns",{{"Column1", "Original"}}),
    #"Changed Type4" = Table.TransformColumnTypes(#"Renamed Columns2",{{"Custom", type text}}),
    #"Renamed Columns3" = Table.RenameColumns(#"Changed Type4",{{"Custom", "Logic"}}),
    #"Inserted Text Between Delimiters1" = Table.AddColumn(#"Renamed Columns3", "Text Between Delimiters.1", each Text.BetweenDelimiters([Original], " ", " ", 1, 0), type text),
    #"Added Conditional Column" = Table.AddColumn(#"Inserted Text Between Delimiters1", "Package", each if [Count space] = 2 then [Text Between Delimiters] else if [Count space] = 4 then [Text Between Delimiters.1] else " "),
    #"Changed Type5" = Table.TransformColumnTypes(#"Added Conditional Column",{{"Package", type text}}),
    #"Reordered Columns1" = Table.ReorderColumns(#"Changed Type5",{"Name", "Format", "Package", "Packaging", "Original", "Count space", "Logic", "0 or2", "Text Before Delimiter.1", "Text Before Delimiter.2", "Text After Delimiter", "After delimiter", "Text Between Delimiters", "Text Between Delimiters.1"}),
    #"Removed Columns" = Table.RemoveColumns(#"Reordered Columns1",{"Format"}),
    #"Renamed Columns4" = Table.RenameColumns(#"Removed Columns",{{"Package", "Format"}})
in
    #"Renamed Columns4"




Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




View solution in original post

Super User III
Super User III

@Manumitec 

1. Format always start by a number? If not, how do we identify format?

2. What types of packagings are there?

 

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

SU18_powerbi_badge

 

Format always starts with a number, and packagings are "pallet" and "boxes".

 

Thank you again!

Super User III
Super User III

Hi @Manumitec 

You should first try to describe all the possible cases you can encounter and how you want to treat those. This will require the most work, probably. Then we can put together the code to accomplish it.  For instance, when there are no numbers, do you want all the content to be the product? 

 

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

SU18_powerbi_badge

Hi @AlB

Possible cases are:

Name of product (Red brick)

Name of produc, format (Red brick 10KG)

Name of product, packaging (Red brick pallet)

Name of product, format, packaging (Red brick 10KG pallet)

Packaging (Pallet)

 

But actually, I just need the rows that contains "Name of product, format, packaging (Red brick 10KG pallet)" and Name of produc, format (Red brick 10KG).

The rest of them can be null or blank.

 

Thank you!

Helpful resources

Announcements
secondImage

Happy New Year from Power BI

This is a must watch for a message from Power BI!

December Update

Check it Out!

Click here to read more about the December 2020 Updates!

Community Blog

Check it Out!

Click here to read the latest blog and learn more about contributing to the Power BI blog!

Get Ready for Power BI Dev Camp

Get Ready for Power BI Dev Camp

Mark your calendars and join us for our next Power BI Dev Camp!.

Top Solution Authors
Top Kudoed Authors