Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Jessi
Frequent Visitor

Splitting and pivoting a messy column

I have a column that is formatted like this:

 

attr1=text,attr2=more text,attr3=some more text, etc.

 

To separate this information, I split the column into rows by "," then into columns by "=" then pivoted which is straightforward. However, it turns out, the text itself often includes commas. Each row has a different number of these attributes. Additionally, the attributes themselves are not named in any easily isolatable way. It is very messy indeed. In reality, the column looks more like this:

 

a_list_of_things=first thing, second thing, and third thing,lights=the first light model number, second light model number, and third light model number,color_palette=sunrise, etc.

 

I would like to transform it into this:

IDa_list_of_thingslightscolor_palette
1first thing, second thing, and third thingthe first light model number, second light model number, and third light model numbersunrise
2nullan unspecified number of lightssunrise
3some more thingssome very bright lights, and some dim onesnull

where null means that row did not have that particular attribute.

 

My initial split by "," caused the pivot to create a massive number of columns that were not useful at all. My question to you is: how do I handle this?

 

I have a guess, but I do not know how to execute it. If I could do a substitution of every instance of the string ",[something]=" into "|[something]=" I could split on the "|" character instead, but even then, the formula wouldn't know which "," to look for. I'm at a loss. Any help would be appreciated!

1 ACCEPTED SOLUTION
camargos88
Community Champion
Community Champion

Hi @Jessi ,

 

Nice it worked!

 

Try this code for a dynamic column names:

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("fVBBCoMwEPzKkrMUrfToJ3oVCdGsGojZko2F/r6amPYinjI7Q2Z3pm1FJQrxpJ7CDh7lrSy3V0lrOEgaZZiNm7gZjecAcSiAcSCn86QS9JmwZpoDN2FGSL8iAQtptODWpUf/sziT/n4n6kCWvHwpiyFgw6vzhlF0RSvuOccOqvIIchyjHKyOXziY0aA+zIDGtIIvbOtsW1/Vw7TgdqjH1AHnEiL/Rv+B3scwx74YMoraLEAOWXTdFw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Name = _t, Price = _t, Attributes = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Name", type text}, {"Price", Int64.Type}, {"Attributes", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "table", each Table.PromoteHeaders(Table.Transpose(Table.FromList(Text.Split(Text.RemoveRange(Text.Combine(List.Transform(Text.Split([Attributes], ","), each if Text.Contains(_, "=") then "**" & _ else _)), 0, 2),"**"), Splitter.SplitTextByDelimiter("="))))),
#"Expanded table" = Table.ExpandTableColumn(#"Added Custom", "table", List.Distinct(List.Combine(List.Transform(#"Added Custom"[table], each Table.ColumnNames(_)))))
in
#"Expanded table"

 

The bold part gets the column names and expand them.



Did I answer your question? Mark my post as a solution!

Proud to be a Super User!



View solution in original post

7 REPLIES 7
v-easonf-msft
Community Support
Community Support

Hi , @Jessi 
Can you explain it more?

The initial sample data and expected results will help us better understand your problem

 

Best Regards,
Community Support Team _ Eason

Hello @v-easonf-msft and @camargos88 !

 

The data might start out looking like this:

 

IDNamePriceAttributes
1Robot150.00a_list_of_things=first thing, second thing, and third thing,lights=the first light model number, second light model number, and third light model number,color_palette=sunrise
2Robot2100.00lights=an unspecified number of lights,color_palette=sunrise
3Robot350.00a_list_of_things=some more things,lights=some very bright lights, and some dim ones

 

And I would like to turn it into a table that looks like this:

IDNamePricea_list_of_thingslightscolor_palette
1Robot150.00first thing, second thing, and third thingthe first light model number, second light model number, and third light modelsunrise
2Robot2100.00nullan unspecified number of lightssunrise
3Robot350.00some more thingssome very bright lights, and some dim onesnull

 

The attributes themselves contain a mixture of text, numbers, and, importantly, commas. There are more than 3 attributes, but this is just an example set. Does that help?

camargos88
Community Champion
Community Champion

Hi @Jessi ,

 

Try this m code:

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("fVBBCoMwEPzKkrMUrfToJ3oVCdGsGojZko2F/r6amPYinjI7Q2Z3pm1FJQrxpJ7CDh7lrSy3V0lrOEgaZZiNm7gZjecAcSiAcSCn86QS9JmwZpoDN2FGSL8iAQtptODWpUf/sziT/n4n6kCWvHwpiyFgw6vzhlF0RSvuOccOqvIIchyjHKyOXziY0aA+zIDGtIIvbOtsW1/Vw7TgdqjH1AHnEiL/Rv+B3scwx74YMoraLEAOWXTdFw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Name = _t, Price = _t, Attributes = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Name", type text}, {"Price", Int64.Type}, {"Attributes", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "table", each Table.PromoteHeaders(Table.Transpose(Table.FromList(
Text.Split(Text.RemoveRange(Text.Combine(List.Transform(Text.Split([Attributes], ","),
each if Text.Contains(_, "=") then "**" & _ else _)), 0, 2),"**"), Splitter.SplitTextByDelimiter("="))))),
#"Expanded table" = Table.ExpandTableColumn(#"Added Custom", "table", {"a_list_of_things", "lights", "color_palette"}, {"a_list_of_things", "lights", "color_palette"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded table",{"Attributes"})
in
#"Removed Columns"

 

Capture.PNG

 



Did I answer your question? Mark my post as a solution!

Proud to be a Super User!



Hello @camargos88 ,

 

This works great for the sample data! I also learned a lot about making subtables, so thank you for that. 😀

 

I have one more question. Is there a way to do this without knowing the column names ahead of time?

camargos88
Community Champion
Community Champion

Hi @Jessi ,

 

Nice it worked!

 

Try this code for a dynamic column names:

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("fVBBCoMwEPzKkrMUrfToJ3oVCdGsGojZko2F/r6amPYinjI7Q2Z3pm1FJQrxpJ7CDh7lrSy3V0lrOEgaZZiNm7gZjecAcSiAcSCn86QS9JmwZpoDN2FGSL8iAQtptODWpUf/sziT/n4n6kCWvHwpiyFgw6vzhlF0RSvuOccOqvIIchyjHKyOXziY0aA+zIDGtIIvbOtsW1/Vw7TgdqjH1AHnEiL/Rv+B3scwx74YMoraLEAOWXTdFw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Name = _t, Price = _t, Attributes = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Name", type text}, {"Price", Int64.Type}, {"Attributes", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "table", each Table.PromoteHeaders(Table.Transpose(Table.FromList(Text.Split(Text.RemoveRange(Text.Combine(List.Transform(Text.Split([Attributes], ","), each if Text.Contains(_, "=") then "**" & _ else _)), 0, 2),"**"), Splitter.SplitTextByDelimiter("="))))),
#"Expanded table" = Table.ExpandTableColumn(#"Added Custom", "table", List.Distinct(List.Combine(List.Transform(#"Added Custom"[table], each Table.ColumnNames(_)))))
in
#"Expanded table"

 

The bold part gets the column names and expand them.



Did I answer your question? Mark my post as a solution!

Proud to be a Super User!



Hi @camargos88 ,

 

I tried this with my real data this afternoon, and it worked like a charm! Thank you so much for your help. 🙂

camargos88
Community Champion
Community Champion

@Jessi ,

 

Can you post some data as example ?



Did I answer your question? Mark my post as a solution!

Proud to be a Super User!



Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.