Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
piorsenpai
Frequent Visitor

Split multiple row to multiple row by line break based on character conditions

Hello everyone,

 

I have a dataframe that looks as follows:

image.png

 
 

I want to split multiple row to multiple row by line break and first character each line.

 

Here is the result I want:

image.png

I have listed all the case that appear on table above.

 

Here is excel file for more convenient:

https://1drv.ms/x/s!AgaDgBBM4rAGli_UZM0WlmdiT7wj?e=L8WpTG

 

Thank you!

 

1 ACCEPTED SOLUTION

Here you go. 

 

let
    Source = Excel.Workbook(File.Contents("C:\Users\Ed Hansberry\OneDrive\Downloads\split_test.xlsx"), null, true),
    test_Sheet = Source{[Item="test",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(test_Sheet, [PromoteAllScalars=true]),
    #"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([checklist] <> null)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"num", "checklist"}, {{"All Rows", each _, type table [num=number, checklist=text, sop=text, expected=text]}}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Grouped Rows", {{"checklist", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "checklist"),
    #"Filtered Rows1" = Table.SelectRows(#"Split Column by Delimiter", each ([checklist] <> "")),
    #"Added First Column" = 
        Table.AddColumn(
            #"Filtered Rows1", 
            "First Column",
            each 
                let
                    varGroupText = Text.Start([checklist], 1)
                in
                Text.Combine(
                    Table.SelectRows(
                        Table.FromList(
                            Text.Split([All Rows][sop]{0}, "#(lf)"),
                            Splitter.SplitByNothing()
                        ),
                        each Text.Start([Column1],1) = varGroupText
                    )[Column1],
                "#(lf)"
                )
            ),
    #"Added Second Column" = 
        Table.AddColumn(
            #"Added First Column", 
            "Second Column",
            each 
                let
                    varGroupText = Text.Start([checklist], 1)
                in
                Text.Combine(
                    Table.SelectRows(
                        Table.FromList(
                            Text.Split([All Rows][expected]{0}, "#(lf)"),
                            Splitter.SplitByNothing()
                        ),
                        each Text.Start([Column1],1) = varGroupText
                    )[Column1],
                "#(lf)"
                )
            ),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Second Column",{"num", "checklist", "First Column", "Second Column"})
in
    #"Removed Other Columns"

 

Again, change the source to your file. THe key was adding 

Splitter.SplitByNothing()

to the Table.FromList functions in each column. 

 

Apparently the unusual ASCII characters was causing Power Query to split more than you wanted, so this forces no splitting within a Table.FromList. If @ImkeF knows more I'd love her wisdom on this usage.

edhans_0-1613449374564.png

You need to be 100% sure nothing is being dropped. I cannot read that language so it isn't efficient for me to do character comparisons.

 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

View solution in original post

13 REPLIES 13
piorsenpai
Frequent Visitor

Hello everyone,

 

I have a dataframe that looks as follows:

 

numchecklistsopexpected
1

A. (text)

B. (text)

C. (text)

D. (text)

A. (text)

B. (text)

C. (text)

D. (text)

A. (text)

B. (text)

C. (text)

D. (text)

2

A. (text)

B. (text)

C. (text)

A 1. (text)

A 2. (text)

B 1. (text)

B 2. (text)

C. (text)

A 1. (text)

A 2. (text)

B 1-1. (text)

B 1-2. (text)

B 2. (test)

C 1. (text)

C 2. (text)

3

A. (text)

B. (text)

C. (text)

A. (text)

B 1. (text)

B 2. (text)

C. (text)

A. (text)

C. (text)

 

 

I want to split multiple row to multiple row by line break and first character each line.

Here is the result I want:

numchecklistsopexpected
1A. (text)A. (text)A. (text)
1B. (text)B. (text)B. (text)
1C. (text)C. (text)C. (text)
1D. (text)D. (text)D. (text)
2A. (text)

A 1. (text)

A 2. (text)

A 1. (text)

A 2. (text)

2B. (text)

B 1. (text)

B 2. (text)

B 1-1. (text)

B 1-2. (text)

B 2. (test)

2C. (text)C. (text)

C 1. (text)

C 2. (text)

3A. (text)A. (text)A. (text)
3B. (text)

B 1. (text)

B 2. (text)

 

3C. (text)C. (text)C. (text)

 

Here is excel file for more convenient:

https://1drv.ms/x/s!AgaDgBBM4rAGli_UZM0WlmdiT7wj?e=L8WpTG

 

I have listed all the case that appear on table above.

 

Thank you!

I cannot tell what you are wanting based on what you have posted @piorsenpai - that is just a lot of text that when I paste it into Excel or a text editor, it is one long string. 

 

But if you want to break based on line feeds, you can use the Split Column feature and use special characters.

edhans_0-1613405610801.png

It turns this:

edhans_1-1613405645266.png

into this:

edhans_2-1613405661560.png

If you need more help, I think you are going to have to share an Excel file via OneDrive or Dropbox with real sample data. Pasting text into the forum will destroy any of the special characters you are trying to parse. That includes your expected results. It looks a bit jumbled to me based on what the forum did with it.

 

 

 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Sorry for that inconvenient,

 

Here is excel file which I added Data sample and Result that I want in 2 sheets:

https://1drv.ms/x/s!AgaDgBBM4rAGli_UZM0WlmdiT7wj?e=rG2tqQ

 

If possible, I hope I can have a method for multiple row like that not just 3 rows in file excel sample because the file which I am working on have thousands of row like that.

Ok, see if this works. It turns this:

edhans_0-1613412771396.png

into this:

edhans_1-1613412799061.png

This is the full code:

let
    Source = Excel.Workbook(File.Contents("C:\Users\Ed Hansberry\OneDrive\Downloads\test.xlsx"), null, true),
    Data_Sheet = Source{[Item="Data",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Data_Sheet, [PromoteAllScalars=true]),
    #"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([checklist] <> null)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"num", "checklist"}, {{"All Rows", each _, type table [num=number, checklist=text, sop=text, expected=text]}}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Grouped Rows", {{"checklist", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "checklist"),
    #"Filtered Rows1" = Table.SelectRows(#"Split Column by Delimiter", each ([checklist] <> "")),
    #"Added First Column" = 
        Table.AddColumn(
            #"Filtered Rows1", 
            "First Column",
            each 
                let
                    varGroupText = Text.Start([checklist], 1)
                in
                Text.Combine(
                    Table.SelectRows(
                        Table.FromList(
                            Text.Split([All Rows][sop]{0}, "#(lf)")
                        ),
                        each Text.Start([Column1],1) = varGroupText
                    )[Column1],
                "#(lf)"
                )
            ),
    #"Added Second Column" = 
        Table.AddColumn(
            #"Added First Column", 
            "Second Column",
            each 
                let
                    varGroupText = Text.Start([checklist], 1)
                in
                Text.Combine(
                    Table.SelectRows(
                        Table.FromList(
                            Text.Split([All Rows][expected]{0}, "#(lf)")
                        ),
                        each Text.Start([Column1],1) = varGroupText
                    )[Column1],
                "#(lf)"
                )
            ),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Second Column",{"num", "checklist", "First Column", "Second Column"})
in
    #"Removed Other Columns"

You will need to change your path in the SOURCE line for it to connect to your test.xlsx file on your hard drive.

 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
ImkeF
Super User
Super User

Hey @edhans ,

ha, ha, that's pretty mean.
Unfortunately I also don't have any idea what the undocumented default-splitter function is. 
Also have no idea why someone considered it not to be Splitter.SplitByNothing 😉

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

edhans
Super User
Super User

See this code @piorsenpai 

 

let
    Source = Excel.Workbook(File.Contents("C:\Users\Ed Hansberry\OneDrive\Downloads\test.xlsx"), null, true),
    Data_Sheet = Source{[Item="Data",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Data_Sheet, [PromoteAllScalars=true]),
    #"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([checklist] <> null)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"num", "checklist"}, {{"All Rows", each _, type table [num=number, checklist=text, sop=text, expected=text]}}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Grouped Rows", {{"checklist", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "checklist"),
    #"Filtered Rows1" = Table.SelectRows(#"Split Column by Delimiter", each ([checklist] <> "")),
    #"Added First Column" = 
        Table.AddColumn(
            #"Filtered Rows1", 
            "First Column",
            each 
                let
                    varGroupText = Text.Start([checklist], 1)
                in
                Text.Combine(
                    Table.SelectRows(
                        Table.FromList(
                            Text.Split([All Rows][sop]{0}, "#(lf)")
                        ),
                        each Text.Start([Column1],1) = varGroupText
                    )[Column1],
                "#(lf)"
                )
            ),
    #"Added Second Column" = 
        Table.AddColumn(
            #"Added First Column", 
            "Second Column",
            each 
                let
                    varGroupText = Text.Start([checklist], 1)
                in
                Text.Combine(
                    Table.SelectRows(
                        Table.FromList(
                            Text.Split([All Rows][expected]{0}, "#(lf)")
                        ),
                        each Text.Start([Column1],1) = varGroupText
                    )[Column1],
                "#(lf)"
                )
            ),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Second Column",{"num", "checklist", "First Column", "Second Column"})
in
    #"Removed Other Columns"

 

It turns this after I filtered out some blank/empty rows in the Excel file:

edhans_0-1613416349095.png

into this:

edhans_1-1613416363163.png

 

You will need to change the "Source" line to be your folder path to the test.xlsx file.

 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

It worked on the test.xlsx file. However, when I start working on real data some rows still can work but some is not and occur an error

DataFormat.Error: There were more columns in the result than expected.
Details:
Count=1

image.png

I would have to see the data. There is nothing in my code that is expecting a specific number of columns. Change the "Second Column" code in the advanced editor this and show me what is in one of the lists that is currently returning an error. Row 10 for example, and we can work from there.

    #"Added Second Column" = 
        Table.AddColumn(
            #"Added First Column", 
            "Second Column",
            each 
                let
                    varGroupText = Text.Start([checklist], 1)
                in
                Table.SelectRows(
                    Table.FromList(
                        Text.Split([All Rows][expected]{0}, "#(lf)")
                    ),
                    each Text.Start([Column1],1) = varGroupText
                )[Column1]
        ),

edhans_0-1613442327474.png

 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

It still return error value in List

piorsenpai_0-1613444370892.png

 

And here is the data in that cell for your information :

piorsenpai_1-1613444465803.png

 

Here you go. 

 

let
    Source = Excel.Workbook(File.Contents("C:\Users\Ed Hansberry\OneDrive\Downloads\split_test.xlsx"), null, true),
    test_Sheet = Source{[Item="test",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(test_Sheet, [PromoteAllScalars=true]),
    #"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([checklist] <> null)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"num", "checklist"}, {{"All Rows", each _, type table [num=number, checklist=text, sop=text, expected=text]}}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Grouped Rows", {{"checklist", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "checklist"),
    #"Filtered Rows1" = Table.SelectRows(#"Split Column by Delimiter", each ([checklist] <> "")),
    #"Added First Column" = 
        Table.AddColumn(
            #"Filtered Rows1", 
            "First Column",
            each 
                let
                    varGroupText = Text.Start([checklist], 1)
                in
                Text.Combine(
                    Table.SelectRows(
                        Table.FromList(
                            Text.Split([All Rows][sop]{0}, "#(lf)"),
                            Splitter.SplitByNothing()
                        ),
                        each Text.Start([Column1],1) = varGroupText
                    )[Column1],
                "#(lf)"
                )
            ),
    #"Added Second Column" = 
        Table.AddColumn(
            #"Added First Column", 
            "Second Column",
            each 
                let
                    varGroupText = Text.Start([checklist], 1)
                in
                Text.Combine(
                    Table.SelectRows(
                        Table.FromList(
                            Text.Split([All Rows][expected]{0}, "#(lf)"),
                            Splitter.SplitByNothing()
                        ),
                        each Text.Start([Column1],1) = varGroupText
                    )[Column1],
                "#(lf)"
                )
            ),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Second Column",{"num", "checklist", "First Column", "Second Column"})
in
    #"Removed Other Columns"

 

Again, change the source to your file. THe key was adding 

Splitter.SplitByNothing()

to the Table.FromList functions in each column. 

 

Apparently the unusual ASCII characters was causing Power Query to split more than you wanted, so this forces no splitting within a Table.FromList. If @ImkeF knows more I'd love her wisdom on this usage.

edhans_0-1613449374564.png

You need to be 100% sure nothing is being dropped. I cannot read that language so it isn't efficient for me to do character comparisons.

 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Thank you so much! I can see It worked perfectly now. Very appriciate that!

Glad it helped @piorsenpai - I learned something myself today. 😁



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Oh, I figured it out. It was the commas in your list @piorsenpai.

 

Table.FromList automatically attempts to split text in lists where there is a comma. 

edhans_0-1613493552121.png

 

So this simple list will cause the same error:

let
    Source = {"Test", "Test, more test", "test3"},
    ConvertToTable = Table.FromList(Source)
in
    ConvertToTable

edhans_1-1613493596494.png

Same error - more columns than expected:

edhans_2-1613493684205.png

Add Splitter.SplitByNothing() as the 2nd parameter of Table.ToList fixes it.

let
    Source = {"Test", "Test, more test", "test3"},
    ConvertToTable = Table.FromList(Source, Splitter.SplitByNothing())
in
    ConvertToTable

I still don't see the logic of why Table.FromList attempts to do any parsing by default though, but it does.

This code works - and creates a 2 column table.

let
    Source = {"Test, Another", "Test, more test", "test3, final"},
    ConvertToTable = Table.FromList(Source)
in
    ConvertToTable

edhans_3-1613493965894.png

 



FYI @ImkeF 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors