Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Lwpro
Frequent Visitor

Text To Columns - GR

Good afternoon everyone,

I would like to ask for your help, as I am facing the following issue. I work with a product table, where I have a field for products. In this field, I have the name and the weight of each product (except for some that do not have it). In the table, I do not have a field just for the weight, and today the need arose for me to have this field. I thought about doing something similar to Excel's "text to columns", but so far, this solution is not meeting my needs, or perhaps I am not sure how to do it. I would like to ask for your help to see if it is possible to separate only the weight, creating another column through Power Query.

Some examples of products:

Chocolate 80g
White Chocolate 90g
Biscuit 60g Gold
Product from 100g to 500g

I would like to point out that they do not follow a pattern of being always at the far right. In some cases, the weight is in the middle of the name, in others to the right, and in other cases, the "g" is separated.

Do you know how to develop a solution like this through Power Query?

Thank you in advance.

2 ACCEPTED SOLUTIONS
dufoq3
Super User
Super User

Hi @Lwpro,

it is not an easy task because we don't see exact data, but something like this could help.

 

Edit 2024-03-01: I've updated my code a bit

It is mandantory to have space after each unit of measure (or it has to be at the end of text string)

 

Result:

dufoq3_0-1709317960499.png

 

You have to define units of measure in 2nd step (just separate them with coma)

dufoq3_0-1709318347743.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ZY+xCsIwFEV/5VJwc3htreioDgGhKDg4lA6ljWkgTSRNBP/epA62mPHcw819VZWcetMa1TiOlGgFYEcsqddVcu9lgL94T2LiRzm2XjpsyfU4aC1f3I6NfSMQgfiYUd2kXq3pfOvwsGaI9QLOoKC/IgjcpAo9E59/CbZQ04wYMso2OHvNF1HpBUrjwhoUeRwRvfzwtFJ9zwmFNo4ABoWLbbTgyyDgMC8taC7UHw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Item = _t]),
    DefineUnitsOfMeasureHere = "g, kg, ml, l",
    Ad_Weight = Table.AddColumn(Source, "Weight", each [ l_units = List.Buffer(List.Transform(Text.Split(DefineUnitsOfMeasureHere, ","), Text.Trim)),
       l_removedSpaces = List.ReplaceMatchingItems(List.RemoveItems(Text.Split(Text.Upper([Item]), " "), {""}), List.Zip( { List.Transform(l_units, Text.Upper), l_units } )),
       t_replaceUnitsSpaces = 
           List.Accumulate(
              List.Zip( { List.Transform(l_units, each " " & _), l_units } ),
              Text.Combine(l_removedSpaces, " "),
              (s,c)=> Text.Replace(s, c{0}, c{1} )
           ),
       //Transform to lower text list and remove rows with numbers only    
       l_weight1 = List.Select(List.Transform(Text.Split(t_replaceUnitsSpaces, " "), Text.Lower), each (try Number.From(_) otherwise _) is text),
       
       //Select only valid weight
       l_weight2 =
            List.Accumulate(
                l_units,
                {},
                (s,c)=> s & List.Transform( l_weight1, each 
                                if Text.EndsWith(_, c)
                                and (try Number.From(Text.Replace(_, c, "")) otherwise _) is number
                                then _ else null )
            ),
       l_weightFinal = List.RemoveNulls(l_weight2),      
       t_result = if List.Count(l_weight2) > 1 then Text.Combine(l_weight2, " to ") else l_weight2{0}
     ][t_result], type text)
in
    Ad_Weight

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

m_dekorte
Super User
Super User

Hi @Lwpro,

 

Alternatively, you can try something like this.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs7IT87PSSxJVVCwMHBXitWJVgrPyARyERKWBulgcafM4uTSzBIFBQUzg3QFEHDPz0kBSwUU5aeUJpcopBXl5yoYGgClS/IVTA3QNJoZKKQrBGfmlKUWgcWRrVAA2h0LAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Item = _t]),
    getWeight = Table.AddColumn(Source, "Weight", each 
        [
            a = List.RemoveMatchingItems( List.Transform( Splitter.SplitTextByCharacterTransition( each true, {"G", "g"})([Item]), (x)=> Text.Select(Text.Trim(x), {"0".."9"}) ), {"", null}),
            b = if List.Count(a) >1 then Text.Combine( a, "g to ") & "g" else List.First(a) & "g"
        ][b], type text
            
    )
in
    getWeight

 

That will get you this result.

m_dekorte_0-1709278139324.png

 

or maybe even this, as there appear to be no other numbers contained within your string.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs7IT87PSSxJVVCwMHBXitWJVgrPyARyERKWBulgcafM4uTSzBIFBQUzg3QFEHDPz0kBSwUU5aeUJpcopBXl5yoYGgClS/IVTA3QNJoZKKQrBGfmlKUWgcWRrVAA2h0LAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Item = _t]),
    getWeight = Table.AddColumn(Source, "Weight", each 
        [
            a = Text.Split([Item], " "),
            b = List.RemoveMatchingItems( List.Transform(a, each Text.Select(_, {"0".."9"})), {"", null}),
            c = if List.Count(b) >1 then Text.Combine( b, "g to ") & "g" else List.First(b) & "g"
        ][c], type text
            
    )
in
    getWeight

I hope this is helpful

View solution in original post

4 REPLIES 4
m_dekorte
Super User
Super User

Hi @Lwpro,

 

Alternatively, you can try something like this.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs7IT87PSSxJVVCwMHBXitWJVgrPyARyERKWBulgcafM4uTSzBIFBQUzg3QFEHDPz0kBSwUU5aeUJpcopBXl5yoYGgClS/IVTA3QNJoZKKQrBGfmlKUWgcWRrVAA2h0LAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Item = _t]),
    getWeight = Table.AddColumn(Source, "Weight", each 
        [
            a = List.RemoveMatchingItems( List.Transform( Splitter.SplitTextByCharacterTransition( each true, {"G", "g"})([Item]), (x)=> Text.Select(Text.Trim(x), {"0".."9"}) ), {"", null}),
            b = if List.Count(a) >1 then Text.Combine( a, "g to ") & "g" else List.First(a) & "g"
        ][b], type text
            
    )
in
    getWeight

 

That will get you this result.

m_dekorte_0-1709278139324.png

 

or maybe even this, as there appear to be no other numbers contained within your string.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs7IT87PSSxJVVCwMHBXitWJVgrPyARyERKWBulgcafM4uTSzBIFBQUzg3QFEHDPz0kBSwUU5aeUJpcopBXl5yoYGgClS/IVTA3QNJoZKKQrBGfmlKUWgcWRrVAA2h0LAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Item = _t]),
    getWeight = Table.AddColumn(Source, "Weight", each 
        [
            a = Text.Split([Item], " "),
            b = List.RemoveMatchingItems( List.Transform(a, each Text.Select(_, {"0".."9"})), {"", null}),
            c = if List.Count(b) >1 then Text.Combine( b, "g to ") & "g" else List.First(b) & "g"
        ][c], type text
            
    )
in
    getWeight

I hope this is helpful

Hi @m_dekorte, to be honest I didn't know about Splitter.SplitTextByCharacterTransition so thank you 🙂

 

you just should repair this:

your v1

dufoq3_0-1709302632161.png

 

your v2

dufoq3_1-1709302716962.png

 

my version:

dufoq3_2-1709302734751.png

 

@Lwpro, if you find something else - let us know 😉


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Yes, that limitation was called out.

If a representative sample was provided there's no need for adjustment. Time will tell...

 

Great stuff.

dufoq3
Super User
Super User

Hi @Lwpro,

it is not an easy task because we don't see exact data, but something like this could help.

 

Edit 2024-03-01: I've updated my code a bit

It is mandantory to have space after each unit of measure (or it has to be at the end of text string)

 

Result:

dufoq3_0-1709317960499.png

 

You have to define units of measure in 2nd step (just separate them with coma)

dufoq3_0-1709318347743.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ZY+xCsIwFEV/5VJwc3htreioDgGhKDg4lA6ljWkgTSRNBP/epA62mPHcw819VZWcetMa1TiOlGgFYEcsqddVcu9lgL94T2LiRzm2XjpsyfU4aC1f3I6NfSMQgfiYUd2kXq3pfOvwsGaI9QLOoKC/IgjcpAo9E59/CbZQ04wYMso2OHvNF1HpBUrjwhoUeRwRvfzwtFJ9zwmFNo4ABoWLbbTgyyDgMC8taC7UHw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Item = _t]),
    DefineUnitsOfMeasureHere = "g, kg, ml, l",
    Ad_Weight = Table.AddColumn(Source, "Weight", each [ l_units = List.Buffer(List.Transform(Text.Split(DefineUnitsOfMeasureHere, ","), Text.Trim)),
       l_removedSpaces = List.ReplaceMatchingItems(List.RemoveItems(Text.Split(Text.Upper([Item]), " "), {""}), List.Zip( { List.Transform(l_units, Text.Upper), l_units } )),
       t_replaceUnitsSpaces = 
           List.Accumulate(
              List.Zip( { List.Transform(l_units, each " " & _), l_units } ),
              Text.Combine(l_removedSpaces, " "),
              (s,c)=> Text.Replace(s, c{0}, c{1} )
           ),
       //Transform to lower text list and remove rows with numbers only    
       l_weight1 = List.Select(List.Transform(Text.Split(t_replaceUnitsSpaces, " "), Text.Lower), each (try Number.From(_) otherwise _) is text),
       
       //Select only valid weight
       l_weight2 =
            List.Accumulate(
                l_units,
                {},
                (s,c)=> s & List.Transform( l_weight1, each 
                                if Text.EndsWith(_, c)
                                and (try Number.From(Text.Replace(_, c, "")) otherwise _) is number
                                then _ else null )
            ),
       l_weightFinal = List.RemoveNulls(l_weight2),      
       t_result = if List.Count(l_weight2) > 1 then Text.Combine(l_weight2, " to ") else l_weight2{0}
     ][t_result], type text)
in
    Ad_Weight

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors