cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
Lori_C Frequent Visitor
Frequent Visitor

Split text string column values into multiple rows at a position relative to a delimiter

Hello, I need to split the values in the Notes column (see BEFORE) into a separate row for each note (see AFTER).  A new note begins when a timestamp occurs in the text string.  The timestamp follows this pattern and is always MDT: MM-DD-YYYY HH:MM AM (or PM) MDT.  The number of individual notes in each column value varies between 1 and 70.  The idea I have come up with, although unsuccessful in implementing, is to identify the position of each MDT occurrence in the the row and split 23 positions *before* that occurrence, which puts the date at the beginning of the new row and the note after it.  I could then extract the date into a separate column and change the data type so it's a true datetime stamp.  This is only my thought process though and I am open to any and all creative ideas that get me to the result I need in the AFTER example!  Note, using the BR tags as a delimiter is not an option with this source data.  Thank you in advance for your help and ideas!

 

BEFORE

  
Order_IDItem_IDNotes
64272217995869307-03-2018 7:52 AM MDT STATUS SAVED AS COMPLETE (NOT PERFORMED) BY KATHLEEN DOE FROM COMPANY1, LLC.<BR/>07-02-2018 12:53 PM MDT STATUS SAVED AS READY (NOT PERFORMED) BY ANN ROSE FROM COMPANY1, LLC.<BR/>07-02-2018 12:53 PM MDT VENDOR CHANGED TO: NOT PERFORMED<BR/>07-02-2018 9:50 AM MDT VIOLET GRAY ASSIGNED THIS ITEM TO ANN ROSE.<BR/>07-02-2018 9:38 AM MDT ROBERT SMITH ASSIGNED THIS ITEM TO VIOLET GRAY.<BR/>07-02-2018 5:09 AM MDT ITEM CREATED BY EDWARD JONES
64577118002872607-05-2018 10:56 AM MDT STATUS SAVED AS COMPLETE (YES) BY JUSTIN TIME FROM COMPANY1, LLC.<BR/>07-03-2018 1:06 PM MDT STATUS SAVED AS READY (YES) BY DOROTHY TOTO FROM COMPANY1, LLC.<BR/>07-03-2018 1:06 PM MDT DOROTHY TOTO: <BR>EMPLOYMENT VERIFICATION PROVIDED PAM N., HUMAN RESOURCES REP&NBSP;<BR>07-03-2018 7:30 AM MDT OSCAR VILAS ASSIGNED THIS ITEM TO DOROTHY TOTO.<BR/>07-03-2018 7:20 AM MDT VIOLET GRAY ASSIGNED THIS ITEM TO OSCAR VILAS.<BR/>07-02-2018 8:45 PM MDT JOHN DOE ASSIGNED THIS ITEM TO VIOLET GRAY.<BR/>07-02-2018 3:32 PM MDT DINO SAUR ASSIGNED THIS ITEM TO JOHN DOE.<BR/>07-02-2018 1:49 PM MDT JOHN DOE ASSIGNED THIS ITEM TO DINO SAUR.<BR/>07-02-2018 1:31 PM MDT JULIA CHILDS ASSIGNED THIS ITEM TO JOHN DOE.<BR/>07-02-2018 1:31 PM MDT STATUS SAVED AS PENDING (PENDING) BY JULIA CHILDS FROM COMPANY1, LLC.<BR/>07-02-2018 1:31 PM MDT VENDOR CHANGED TO: ATTEMPT #1 - PHONE ATTEMPT<BR/>07-02-2018 1:30 PM MDT JULIA CHILDS: <BR>CALL FOR PROCESS

 

 

AFTER   
Order_IDItem_IDDatetime stampNote
64272217995869307-03-2018 7:52 AM STATUS SAVED AS COMPLETE (NOT PERFORMED) BY KATHLEEN DOE FROM COMPANY1, LLC.<BR/>
64272217995869307-02-2018 12:53 PMSTATUS SAVED AS READY (NOT PERFORMED) BY ANN ROSE FROM COMPANY1, LLC.<BR/>
64272217995869307-02-2018 12:53 PMVENDOR CHANGED TO: NOT PERFORMED<BR/>
64272217995869307-02-2018 9:50 AM VIOLET GRAY ASSIGNED THIS ITEM TO ANN ROSE.<BR/>
64272217995869307-02-2018 9:38 AM ROBERT SMITH ASSIGNED THIS ITEM TO VIOLET GRAY.<BR/>
64272217995869307-02-2018 5:09 AM ITEM CREATED BY EDWARD JONES
64577118002872607-05-2018 10:56 AMSTATUS SAVED AS COMPLETE (YES) BY JUSTIN TIME FROM COMPANY1, LLC.<BR/>
64577118002872607-03-2018 1:06 PM STATUS SAVED AS READY (YES) BY DOROTHY TOTO FROM COMPANY1, LLC.<BR/>
64577118002872607-03-2018 1:06 PM DOROTHY TOTO: <BR>EMPLOYMENT VERIFICATION PROVIDED PAM N., HUMAN RESOURCES REP&NBSP;<BR>
64577118002872607-03-2018 7:30 AM OSCAR VILAS ASSIGNED THIS ITEM TO DOROTHY TOTO.<BR/>
64577118002872607-03-2018 7:20 AM VIOLET GRAY ASSIGNED THIS ITEM TO OSCAR VILAS.<BR/>
64577118002872607-02-2018 8:45 PM JOHN DOE ASSIGNED THIS ITEM TO VIOLET GRAY.<BR/>
64577118002872607-02-2018 3:32 PM DINO SAUR ASSIGNED THIS ITEM TO JOHN DOE.<BR/>
64577118002872607-02-2018 1:49 PM JOHN DOE ASSIGNED THIS ITEM TO DINO SAUR.<BR/>
64577118002872607-02-2018 1:31 PM JULIA CHILDS ASSIGNED THIS ITEM TO JOHN DOE.<BR/>
64577118002872607-02-2018 1:31 PM STATUS SAVED AS PENDING (PENDING) BY JULIA CHILDS FROM COMPANY1, LLC.<BR/>
64577118002872607-02-2018 1:31 PM VENDOR CHANGED TO: ATTEMPT #1 - PHONE ATTEMPT<BR/>
64577118002872607-02-2018 1:30 PM JULIA CHILDS: <BR>CALL FOR PROCESS
3 REPLIES 3
Super User
Super User

Re: Split text string column values into multiple rows at a position relative to a delimiter

See if this works:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("pVTbcpswFPyVM+5LOmOnQpib8iSDbOQgiZGEM0yap04/of/fgw0UZzBxmjcNg3ZXu3vO6+sq3tKE0tV6FSRZFqVxFuKZJBsSbigJUkhYRIErUIUH57lvHDh+EgVwB7lRdSW8gAdtPNTC7o1VovgOuxaeuS8rITQURsDeGnX+m+s2WENV5Y8//xAS/trZH+fD746RXhgDyqIQ6nlKK3jRzvFxrcEa9xWuk9CFsZCXXB+QzRsGVzy3YTIWkcGkkzToCRwsR1HOyYPusErpQHqhEHWUuqArY2E6AFqzExadUNKXNxAnnAugESPZAHq+maObHrHQPlG8cFvA0WjhVm/rrhdRkgRdL1JCaJrQuO9F1DtHWBR/XIxWuHM8x8Z5qcFLdW9Cff8CRuIPyjBwYHjGly0agpb8N8kUhcFw53JF4LNMq4TuymLlXubcS6OhtuYkCxRVoyH6cQ1loziGLJxpbC46pXWHQGO9c/XTNejVsIVjj4zLucVkK3zofOpTpQuPSxj9RDkntAtVStk2Ggw7mvIy5V/pZshCOiYgtcGQG3sDcSBcmmu2ze7UN7ItwoXBCNdUkuOSkFVxK5i7BP5DfF/rGveQ1Ad46A/9AE1oP7fjJlQzK457VF17+BbABuoSF8DwaRGSzPnxfl5yXlWAy7MbEBwD3CxvfwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Order_ID = _t, Item_ID = _t, Notes = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Order_ID", Int64.Type}, {"Item_ID", Int64.Type}, {"Notes", type text}}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Notes", Splitter.SplitTextByDelimiter("<BR/>", QuoteStyle.Csv), {"Notes.1", "Notes.2", "Notes.3", "Notes.4", "Notes.5", "Notes.6", "Notes.7", "Notes.8", "Notes.9", "Notes.10", "Notes.11"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Notes.1", type text}, {"Notes.2", type text}, {"Notes.3", type text}, {"Notes.4", type text}, {"Notes.5", type text}, {"Notes.6", type text}, {"Notes.7", type text}, {"Notes.8", type text}, {"Notes.9", type text}, {"Notes.10", type text}, {"Notes.11", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Order_ID", "Item_ID"}, "Attribute", "Value"),
    #"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Removed Columns", "Value", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Value.1", "Value.2"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Value.1", type date}, {"Value.2", type text}}),
    #"Split Column by Delimiter2" = Table.SplitColumn(#"Changed Type2", "Value.2", Splitter.SplitTextByEachDelimiter({"MDT"}, QuoteStyle.Csv, false), {"Value.2.1", "Value.2.2"}),
    #"Changed Type3" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Value.2.1", type time}, {"Value.2.2", type text}})
in
    #"Changed Type3"

Check-out my Back to School contest submission: Dinosaurs!

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

v-danhe-msft Super Contributor
Super Contributor

Re: Split text string column values into multiple rows at a position relative to a delimiter

Hi @Lori_C,

Based on my test, you can refer to below steps:

1.Use the “Split Column by Delimiter” function to different notes.

A.PNG

2.Unpivot the columns, except the [Order_ID] and the [Item_ID].

B.PNG

3.Remove the [Attribute] column and use the “Split Column by Delimiter” function again to achieve the correct result.

C.PNG

You can also download the PBIX file to have a view.

https://www.dropbox.com/s/wrut5tvgy8xfqc3/Split%20text%20string%20column%20values%20into%20multiple%...

 

Regards,

Daniel He

Community Support Team _ Daniel He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Lori_C Frequent Visitor
Frequent Visitor

Re: Split text string column values into multiple rows at a position relative to a delimiter

Hi Greg and Daniel, thank you both for the suggestions!  Unfortunately they don't work** because of inconsistent break tags in the source data.  The actual data is much uglier than the sanitzed example I posted (shocking, right?).  Each string is littered with breaks (<BR>, <BR/>, <BR>) based on operator entry styles which vary, and multiple operators appear in the same string.  I've tried splitting using the various break tags separately and in different ordered combinations but the results aren't clean enough; also replacing the various tags with a single delimiter but I get too many splits where I don't want them.

 

**Greg I should add, I haven't gotten your code to work - I'm brand new to M and haven't figured out how to replace the posted table with my actual column name - but going through it conceptually and seeing Splitter.SplitTextByDelimiter("<BR/>" makes me think yours relies on the break tag as well.  Please correct me if I'm wrong about that!

 

I keep coming back to my "true" delimiter being the datetime stamp because it's application-generated and always appears at the beginning of each manual entry.  My SAS brain tell me to get the position of every instance of MDT and then count back 19 or 20 characters (which points to the beginning of the date) and say "split here"!  Or to split at the MM of anything formatted like MM-DD-YYYY HH:MM AM (or PM) MDT.  I just can't figure out how to tell this to Get-and-Transform.  Any thoughts?  Thanks much!       

Helpful resources

Announcements
Virtual Launch Event

Microsoft Business Applications October Virtual Launch Event

Join us for an in-depth look at the new innovations across Dynamics 365 and the Microsoft Power Platform.

Summit North America

Power Platform Summit North America

Register by September 5 to save $200

PBI Community Highlights

PBI Community Highlights

Check out what's new in the Power BI Community!

MBAS Gallery

Watch Sessions On Demand!

Continue your learning in our online communities.

Top Kudoed Authors
Users Online
Currently online: 96 members 1,628 guests
Please welcome our newest community members: