Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello everyone,
I have a dataframe that looks as follows:
I want to split multiple row to multiple row by line break and first character each line.
Here is the result I want:
I have listed all the case that appear on table above.
Here is excel file for more convenient:
https://1drv.ms/x/s!AgaDgBBM4rAGli_UZM0WlmdiT7wj?e=L8WpTG
Thank you!
Solved! Go to Solution.
Here you go.
let
Source = Excel.Workbook(File.Contents("C:\Users\Ed Hansberry\OneDrive\Downloads\split_test.xlsx"), null, true),
test_Sheet = Source{[Item="test",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(test_Sheet, [PromoteAllScalars=true]),
#"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([checklist] <> null)),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"num", "checklist"}, {{"All Rows", each _, type table [num=number, checklist=text, sop=text, expected=text]}}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Grouped Rows", {{"checklist", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "checklist"),
#"Filtered Rows1" = Table.SelectRows(#"Split Column by Delimiter", each ([checklist] <> "")),
#"Added First Column" =
Table.AddColumn(
#"Filtered Rows1",
"First Column",
each
let
varGroupText = Text.Start([checklist], 1)
in
Text.Combine(
Table.SelectRows(
Table.FromList(
Text.Split([All Rows][sop]{0}, "#(lf)"),
Splitter.SplitByNothing()
),
each Text.Start([Column1],1) = varGroupText
)[Column1],
"#(lf)"
)
),
#"Added Second Column" =
Table.AddColumn(
#"Added First Column",
"Second Column",
each
let
varGroupText = Text.Start([checklist], 1)
in
Text.Combine(
Table.SelectRows(
Table.FromList(
Text.Split([All Rows][expected]{0}, "#(lf)"),
Splitter.SplitByNothing()
),
each Text.Start([Column1],1) = varGroupText
)[Column1],
"#(lf)"
)
),
#"Removed Other Columns" = Table.SelectColumns(#"Added Second Column",{"num", "checklist", "First Column", "Second Column"})
in
#"Removed Other Columns"
Again, change the source to your file. THe key was adding
Splitter.SplitByNothing()
to the Table.FromList functions in each column.
Apparently the unusual ASCII characters was causing Power Query to split more than you wanted, so this forces no splitting within a Table.FromList. If @ImkeF knows more I'd love her wisdom on this usage.
You need to be 100% sure nothing is being dropped. I cannot read that language so it isn't efficient for me to do character comparisons.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingHello everyone,
I have a dataframe that looks as follows:
num | checklist | sop | expected |
1 | A. (text) B. (text) C. (text) D. (text) | A. (text) B. (text) C. (text) D. (text) | A. (text) B. (text) C. (text) D. (text) |
2 | A. (text) B. (text) C. (text) | A 1. (text) A 2. (text) B 1. (text) B 2. (text) C. (text) | A 1. (text) A 2. (text) B 1-1. (text) B 1-2. (text) B 2. (test) C 1. (text) C 2. (text) |
3 | A. (text) B. (text) C. (text) | A. (text) B 1. (text) B 2. (text) C. (text) | A. (text) C. (text)
|
I want to split multiple row to multiple row by line break and first character each line.
Here is the result I want:
num | checklist | sop | expected |
1 | A. (text) | A. (text) | A. (text) |
1 | B. (text) | B. (text) | B. (text) |
1 | C. (text) | C. (text) | C. (text) |
1 | D. (text) | D. (text) | D. (text) |
2 | A. (text) | A 1. (text) A 2. (text) | A 1. (text) A 2. (text) |
2 | B. (text) | B 1. (text) B 2. (text) | B 1-1. (text) B 1-2. (text) B 2. (test) |
2 | C. (text) | C. (text) | C 1. (text) C 2. (text) |
3 | A. (text) | A. (text) | A. (text) |
3 | B. (text) | B 1. (text) B 2. (text) |
|
3 | C. (text) | C. (text) | C. (text) |
Here is excel file for more convenient:
https://1drv.ms/x/s!AgaDgBBM4rAGli_UZM0WlmdiT7wj?e=L8WpTG
I have listed all the case that appear on table above.
Thank you!
I cannot tell what you are wanting based on what you have posted @piorsenpai - that is just a lot of text that when I paste it into Excel or a text editor, it is one long string.
But if you want to break based on line feeds, you can use the Split Column feature and use special characters.
It turns this:
into this:
If you need more help, I think you are going to have to share an Excel file via OneDrive or Dropbox with real sample data. Pasting text into the forum will destroy any of the special characters you are trying to parse. That includes your expected results. It looks a bit jumbled to me based on what the forum did with it.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingSorry for that inconvenient,
Here is excel file which I added Data sample and Result that I want in 2 sheets:
https://1drv.ms/x/s!AgaDgBBM4rAGli_UZM0WlmdiT7wj?e=rG2tqQ
If possible, I hope I can have a method for multiple row like that not just 3 rows in file excel sample because the file which I am working on have thousands of row like that.
Ok, see if this works. It turns this:
into this:
This is the full code:
let
Source = Excel.Workbook(File.Contents("C:\Users\Ed Hansberry\OneDrive\Downloads\test.xlsx"), null, true),
Data_Sheet = Source{[Item="Data",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Data_Sheet, [PromoteAllScalars=true]),
#"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([checklist] <> null)),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"num", "checklist"}, {{"All Rows", each _, type table [num=number, checklist=text, sop=text, expected=text]}}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Grouped Rows", {{"checklist", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "checklist"),
#"Filtered Rows1" = Table.SelectRows(#"Split Column by Delimiter", each ([checklist] <> "")),
#"Added First Column" =
Table.AddColumn(
#"Filtered Rows1",
"First Column",
each
let
varGroupText = Text.Start([checklist], 1)
in
Text.Combine(
Table.SelectRows(
Table.FromList(
Text.Split([All Rows][sop]{0}, "#(lf)")
),
each Text.Start([Column1],1) = varGroupText
)[Column1],
"#(lf)"
)
),
#"Added Second Column" =
Table.AddColumn(
#"Added First Column",
"Second Column",
each
let
varGroupText = Text.Start([checklist], 1)
in
Text.Combine(
Table.SelectRows(
Table.FromList(
Text.Split([All Rows][expected]{0}, "#(lf)")
),
each Text.Start([Column1],1) = varGroupText
)[Column1],
"#(lf)"
)
),
#"Removed Other Columns" = Table.SelectColumns(#"Added Second Column",{"num", "checklist", "First Column", "Second Column"})
in
#"Removed Other Columns"
You will need to change your path in the SOURCE line for it to connect to your test.xlsx file on your hard drive.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingHey @edhans ,
ha, ha, that's pretty mean.
Unfortunately I also don't have any idea what the undocumented default-splitter function is.
Also have no idea why someone considered it not to be Splitter.SplitByNothing 😉
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
See this code @piorsenpai
let
Source = Excel.Workbook(File.Contents("C:\Users\Ed Hansberry\OneDrive\Downloads\test.xlsx"), null, true),
Data_Sheet = Source{[Item="Data",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Data_Sheet, [PromoteAllScalars=true]),
#"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([checklist] <> null)),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"num", "checklist"}, {{"All Rows", each _, type table [num=number, checklist=text, sop=text, expected=text]}}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Grouped Rows", {{"checklist", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "checklist"),
#"Filtered Rows1" = Table.SelectRows(#"Split Column by Delimiter", each ([checklist] <> "")),
#"Added First Column" =
Table.AddColumn(
#"Filtered Rows1",
"First Column",
each
let
varGroupText = Text.Start([checklist], 1)
in
Text.Combine(
Table.SelectRows(
Table.FromList(
Text.Split([All Rows][sop]{0}, "#(lf)")
),
each Text.Start([Column1],1) = varGroupText
)[Column1],
"#(lf)"
)
),
#"Added Second Column" =
Table.AddColumn(
#"Added First Column",
"Second Column",
each
let
varGroupText = Text.Start([checklist], 1)
in
Text.Combine(
Table.SelectRows(
Table.FromList(
Text.Split([All Rows][expected]{0}, "#(lf)")
),
each Text.Start([Column1],1) = varGroupText
)[Column1],
"#(lf)"
)
),
#"Removed Other Columns" = Table.SelectColumns(#"Added Second Column",{"num", "checklist", "First Column", "Second Column"})
in
#"Removed Other Columns"
It turns this after I filtered out some blank/empty rows in the Excel file:
into this:
You will need to change the "Source" line to be your folder path to the test.xlsx file.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingIt worked on the test.xlsx file. However, when I start working on real data some rows still can work but some is not and occur an error
DataFormat.Error: There were more columns in the result than expected.
Details:
Count=1
I would have to see the data. There is nothing in my code that is expecting a specific number of columns. Change the "Second Column" code in the advanced editor this and show me what is in one of the lists that is currently returning an error. Row 10 for example, and we can work from there.
#"Added Second Column" =
Table.AddColumn(
#"Added First Column",
"Second Column",
each
let
varGroupText = Text.Start([checklist], 1)
in
Table.SelectRows(
Table.FromList(
Text.Split([All Rows][expected]{0}, "#(lf)")
),
each Text.Start([Column1],1) = varGroupText
)[Column1]
),
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingIt still return error value in List
And here is the data in that cell for your information :
Here you go.
let
Source = Excel.Workbook(File.Contents("C:\Users\Ed Hansberry\OneDrive\Downloads\split_test.xlsx"), null, true),
test_Sheet = Source{[Item="test",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(test_Sheet, [PromoteAllScalars=true]),
#"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([checklist] <> null)),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"num", "checklist"}, {{"All Rows", each _, type table [num=number, checklist=text, sop=text, expected=text]}}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Grouped Rows", {{"checklist", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "checklist"),
#"Filtered Rows1" = Table.SelectRows(#"Split Column by Delimiter", each ([checklist] <> "")),
#"Added First Column" =
Table.AddColumn(
#"Filtered Rows1",
"First Column",
each
let
varGroupText = Text.Start([checklist], 1)
in
Text.Combine(
Table.SelectRows(
Table.FromList(
Text.Split([All Rows][sop]{0}, "#(lf)"),
Splitter.SplitByNothing()
),
each Text.Start([Column1],1) = varGroupText
)[Column1],
"#(lf)"
)
),
#"Added Second Column" =
Table.AddColumn(
#"Added First Column",
"Second Column",
each
let
varGroupText = Text.Start([checklist], 1)
in
Text.Combine(
Table.SelectRows(
Table.FromList(
Text.Split([All Rows][expected]{0}, "#(lf)"),
Splitter.SplitByNothing()
),
each Text.Start([Column1],1) = varGroupText
)[Column1],
"#(lf)"
)
),
#"Removed Other Columns" = Table.SelectColumns(#"Added Second Column",{"num", "checklist", "First Column", "Second Column"})
in
#"Removed Other Columns"
Again, change the source to your file. THe key was adding
Splitter.SplitByNothing()
to the Table.FromList functions in each column.
Apparently the unusual ASCII characters was causing Power Query to split more than you wanted, so this forces no splitting within a Table.FromList. If @ImkeF knows more I'd love her wisdom on this usage.
You need to be 100% sure nothing is being dropped. I cannot read that language so it isn't efficient for me to do character comparisons.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingThank you so much! I can see It worked perfectly now. Very appriciate that!
Glad it helped @piorsenpai - I learned something myself today. 😁
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingOh, I figured it out. It was the commas in your list @piorsenpai.
Table.FromList automatically attempts to split text in lists where there is a comma.
So this simple list will cause the same error:
let
Source = {"Test", "Test, more test", "test3"},
ConvertToTable = Table.FromList(Source)
in
ConvertToTable
Same error - more columns than expected:
Add Splitter.SplitByNothing() as the 2nd parameter of Table.ToList fixes it.
let
Source = {"Test", "Test, more test", "test3"},
ConvertToTable = Table.FromList(Source, Splitter.SplitByNothing())
in
ConvertToTable
I still don't see the logic of why Table.FromList attempts to do any parsing by default though, but it does.
This code works - and creates a 2 column table.
let
Source = {"Test, Another", "Test, more test", "test3, final"},
ConvertToTable = Table.FromList(Source)
in
ConvertToTable
FYI @ImkeF
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingCovering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.