Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
pect
Helper I
Helper I

Split column in multiple rows

Hello,

 

I'd be very grateful if you could help me with an issue I've been struggling all the day through.

 

I have data in below format (1st image) where some columns have multiple values separated by commas. 

 

Capture.PNG

Instead, I need to split columns "task", "start date", "finish date" and "checkpoint" into as many rows as different data are in each cell. Note that each cell has the same number of items for each id (i.e. id A has 3 tasks, 3 start and finish dates, and  3 checkpoints whereas id B has 4 of each). So, the table I am trying to model is the following:

 

Capture.PNG

 

Thank you very much in advance.

 

Àlex

1 ACCEPTED SOLUTION
MarcelBeug
Community Champion
Community Champion

Yes, I overlooked one required adjustment: there is still a "Source" in that line of code, that should be adjusted to #"Promoted Headers".

Specializing in Power Query Formula Language (M)

View solution in original post

11 REPLIES 11
MFelix
Super User
Super User

Hi @pect,

 

You need to use the split into columns mode but with the option split to rows you need to do this for every column below see the steps and final result.

 

SplitColumns.png

 

Regards

MFelix


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



Thank you very much Felix.

 

I already tried what you propose but if I am afraid this is not what I am looking for. In your example your resulting table has 16 rows but it should only have 8 rows. The way you say it does a cartesian product resulting to repeated values I wouldn't need.

 

Àlex

Once you splitted the texts, you can combine the nested lists into tables and have these expanded, to prevent arriving at the cartesian product of all combinations.

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    SplittedTexts = Table.TransformColumns(Source,{{"task", each Text.Split(_,",")}, {"start date", each Text.Split(_,",")}, {"finish date", each Text.Split(_,",")}, {"checkpoint",  each Text.Split(_,",")}}),
    AddedTables = Table.AddColumn(SplittedTexts, "Tables", each Table.FromRows(List.Zip({[task],[start date],[finish date],[checkpoint]}))),
    RemovedColumns = Table.RemoveColumns(AddedTables,{"task", "start date", "finish date", "checkpoint"}),
    ExpandedTables = Table.ExpandTableColumn(RemovedColumns, "Tables", {"Column1", "Column2", "Column3", "Column4"}, List.Skip(Table.ColumnNames(Source),2)),
    ChangedType = Table.TransformColumnTypes(ExpandedTables,{{"id", type text}, {"project", type text}, {"task", Int64.Type}, {"start date", type date}, {"finish date", type date}, {"checkpoint", type text}}, "en-GB")
in
    ChangedType

 

Specializing in Power Query Formula Language (M)

Thank you very much Marcel.

 

I am trying to apply the script you send me but I am afraid I am not properly introducing it. I explain the steps I followed:

 

- First of all, I loaded the table from Excel and entered in the Power Query.

- Then, I promoted headers as by default the headers are "ColumnX".

- Afterwards, I went to the View>Advanced Editor and saw this code:

 

let
Source = Excel.Workbook(File.Contents("C:\Users\aoctavio\Desktop\Book1.xlsx"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
#"Changed Type" = Table.TransformColumnTypes(Sheet1_Sheet,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}}),
#"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true])
in
#"Promoted Headers"

 

- There is where I am having trouble introducing the script you kindly sent me.

 

Could you please help me with this doubt? I'd really appreciate it.

 

Thanks again.

Àlex

MarcelBeug
Community Champion
Community Champion

Steps to take:

  • Copy my code from step "SplittedText" downwards,
  • paste that before your "ïn",
  • add a comma at the end of your step #"Promoted Headers",
  • adjust Source to #"Promoted Headers" in step "SplittedText"
  • delete your last 2 lines.

 

let
    Source = Excel.Workbook(File.Contents("C:\Users\aoctavio\Desktop\Book1.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Changed Type" = Table.TransformColumnTypes(Sheet1_Sheet,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
    SplittedTexts = Table.TransformColumns(#"Promoted Headers",{{"task", each Text.Split(_,",")}, {"start date", each Text.Split(_,",")}, {"finish date", each Text.Split(_,",")}, {"checkpoint",  each Text.Split(_,",")}}),
    AddedTables = Table.AddColumn(SplittedTexts, "Tables", each Table.FromRows(List.Zip({[task],[start date],[finish date],[checkpoint]}))),
    RemovedColumns = Table.RemoveColumns(AddedTables,{"task", "start date", "finish date", "checkpoint"}),
    ExpandedTables = Table.ExpandTableColumn(RemovedColumns, "Tables", {"Column1", "Column2", "Column3", "Column4"}, List.Skip(Table.ColumnNames(Source),2)),
    ChangedType = Table.TransformColumnTypes(ExpandedTables,{{"id", type text}, {"project", type text}, {"task", Int64.Type}, {"start date", type date}, {"finish date", type date}, {"checkpoint", type text}}, "en-GB")
in
    ChangedType

 

Specializing in Power Query Formula Language (M)

Thanks again Marcel! 

 

Still it appears this error:

Capture.PNG

 

Do you have any idea why it happened?

 

Thanks again and sorry for the inconvenience!

Àlex

MarcelBeug
Community Champion
Community Champion

Yes, I overlooked one required adjustment: there is still a "Source" in that line of code, that should be adjusted to #"Promoted Headers".

Specializing in Power Query Formula Language (M)

Excuse me Marcel,

 

I have a fast question about the line "ExpandedTables". The reason of the number 2 of the end of the line is the result of 6 (number of columns) - 4 (number of columns with multiple items)?

 

I thank you again for your huge help.

 

Regards.

 

Àlex

MarcelBeug
Community Champion
Community Champion

Correct. The 2 is the second argument of List.Skip.

In this case it means that the first 2 column names are skipped, so the last 4 column names are returned.

Specializing in Power Query Formula Language (M)

Thanks for all!

Thank you so so much!!

 

It worked perfectly! Really appreciate your help and time.

 

Best regards.

 

Àlex

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.