Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hi,
There are many files containing several sentences in a single excell cell.
This is simplified sample data.
I'm wondering how to split them to the desired dataset.
In actual data, Q(x) & A(x) are sentences.
So I think doulbe line feed will be separator, but difficult to keep coordination of each Q&A.
Does anyone have a good idea??
Solved! Go to Solution.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("XYy7DQAhDEN3SU3DJ4Q2I1CHrMD+5clpcqKxbD/LZlSp0IZoJS9G7ckdGaW2yAO5n3vuhtWwOoIxGCfjH5vPj+TPDJV8i0Ildgs7iC5y/wA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"org index" = _t, question = _t, answer = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"org index", Int64.Type}, {"question", type text}, {"answer", type text}}),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"org index"}, "Attribute", "Value"),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Unpivoted Columns", {{"Value", Splitter.SplitTextByDelimiter("#(lf)#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Value"),
Questions = Table.SelectRows(#"Split Column by Delimiter", each ([Attribute] = "question")),
Answers = Table.SelectRows(#"Split Column by Delimiter", each ([Attribute] = "answer")),
Combine = Table.FromColumns({Questions[org index], Questions[Value], Answers[Value]}, {"org index", "question", "answer"})
in
Combine
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("XYy7DQAhDEN3SU3DJ4Q2I1CHrMD+5clpcqKxbD/LZlSp0IZoJS9G7ckdGaW2yAO5n3vuhtWwOoIxGCfjH5vPj+TPDJV8i0Ildgs7iC5y/wA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"org index" = _t, question = _t, answer = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"org index", Int64.Type}, {"question", type text}, {"answer", type text}}),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"org index"}, "Attribute", "Value"),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Unpivoted Columns", {{"Value", Splitter.SplitTextByDelimiter("#(lf)#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Value"),
Questions = Table.SelectRows(#"Split Column by Delimiter", each ([Attribute] = "question")),
Answers = Table.SelectRows(#"Split Column by Delimiter", each ([Attribute] = "answer")),
Combine = Table.FromColumns({Questions[org index], Questions[Value], Answers[Value]}, {"org index", "question", "answer"})
in
Combine
Very beautiful!
I was thinking to split to 2 queries and merge them.
Your solution perfectly works without making unneccesary query.
Thank you for sophisticated idea.
Hey,
Agree with previous suggested answer, that you should split Answers and Questions.
Step one is to make two tables one would be [Org Index] & [Question] and another [Org Index ] & [Answer].
Then Split by delimeter.
Important point, when you split by delimeter column go to into advanced options
Select Rows
Then add index to both tables and merge on index.
Then it will split into rows your answers.
Then add index.
Hi,
you can import the file to power query (before doing it i have cleaned the useless column and row but you can do it in power query),
remove the column of the answer (we get it back later),
clean text on column question then split column by numer of characters (SplitTextByRepeatedLengths(2)).
Select the org index column and unpivot other columns.
Now duplicate the value column you have obtained and, in the duplicated column, replace values Q with A.
Remove the column attribute, change names and it's done.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.