cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
eacy Regular Visitor
Regular Visitor

Split text string and reorder

Hi,

 

I have a column with accumulated time for different statuses and the number of times a ticket have been each statuses.

Each status information is separated with a "|" and the status name, number of times in status and accumulated time is separated with ":"

 

TIME_IN_STATUS
Open : 1 : 6 | In Progress : 2 : 10 | Customer Action Needed : 2 : 4 Enhancement : 1 : 5 | Pending : 1 : 10 | Rejected : 1 : 82 Open : 1 : 8 | Customer Action Needed : 2 : 28 | In Progress : 2 : 30

 

I would like to end up with the amount of columns equal to the maximum number of statuses which can occur (around 18, splited by "|")

Each column header should be the status name e.g. "Open" and the content should be the accumulated time.

 

It will be applied to around 2 million rows so it is important that it is pretty fast.

 

It is no problem to split the text string into multible columns but to create columns with all the "Open", "In Progress" etc. even though they are not placed in the same place in the string for each row.

 

Any help is appriciated

3 REPLIES 3
Super User
Super User

Re: Split text string and reorder

If I understand correctly, the sample data is the width you have shown (i.e. Not 18 columns). I don't see any ID in the data, but presumable there is one there somewhere. 

 

The first thing I would do is get all the statuses in a single column. So first get them in 3 columns and then save the query (don't load the data). I would then create 3 new queries, each query keeping one of the columns. Finally append them all together. Note this can be done in a single query, but is is more advanced. 

 

Now you say you want to have the values as column headings, so you can then Pivot the data. But note is is not the best data format for power bi. It is better to load it as a long table, then use a matrix to pivot the data. 



* Matt is a Microsoft MVP (Power BI) and author of the Power BI Book Supercharge Power BI.
eacy Regular Visitor
Regular Visitor

Re: Split text string and reorder

Hi Matt,

 

Thanks for your reply,

 

So far I have managed to manipulate my data as I want to but it seems like there is room for improvement concerning the performance of my solution. When I use my date in a chart everything is going very slow.

 

Here is my m code, please advise if you believe it can be speeded up.

 

let
    Choose = (input) => let
     values = {
     {1, "Open"},
     {3, "In Progress"},
     {4, "Reopened"},
     {5, "Resolved"},
     {6, "Closed"},
     {10000, "Emergency"},
     {10001, "Emergency Resolved"},
     {10002, "In Development Test"},
     {10003, "Customer Action Needed"},
     {10005, "In Release Test"},
     {10059, "Customer Action Needed (Resolved)"},
     {10060, "Customer Action Needed (In Progress)"},
     {10061, "Customer Action Needed (In development)"},
     {10062, "Pending Prioritization"},
     {10063, "Customer Action Needed (Pending Prioritization)"},
     {10064, "In Development"},
     {10065, "Ready For Deployment"},
     {10067, "Release Test Completed"},
     {10068, "Customer Action Needed (Ready for deployment)"},
     {10069, "In Emergency Test"},
     {10071, "Not Prioritized"},
     {10081, "Ready For Deployment"},
     {input, "Junk"}
     },
     Result = List.First(List.Select(values, each _{0}=input)){1}
   in
     Result,

    Source = MySQL.Database("jira7-db.xxx.corp", "jira", [Query="select * from customfieldvalue where customfield = 10102", ReturnSingleDatabase=true]),
    #"Renamed Columns" = Table.RenameColumns(Source,{{"TEXTVALUE", "TIME_IN_STATUS"}}),
    #"Replaced Value" = Table.ReplaceValue(#"Renamed Columns","_*|*_","|", Text.Replace, {"TIME_IN_STATUS"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","_*:*_",":", Text.Replace, {"TIME_IN_STATUS"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Replaced Value1","TIME_IN_STATUS",Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv),{"TIME_IN_STATUS.1", "TIME_IN_STATUS.2", "TIME_IN_STATUS.3", "TIME_IN_STATUS.4", "TIME_IN_STATUS.5", "TIME_IN_STATUS.6", "TIME_IN_STATUS.7", "TIME_IN_STATUS.8", "TIME_IN_STATUS.9", "TIME_IN_STATUS.10", "TIME_IN_STATUS.11", "TIME_IN_STATUS.12", "TIME_IN_STATUS.13", "TIME_IN_STATUS.14", "TIME_IN_STATUS.15", "TIME_IN_STATUS.16", "TIME_IN_STATUS.17", "TIME_IN_STATUS.18", "TIME_IN_STATUS.19", "TIME_IN_STATUS.20"}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Split Column by Delimiter","TIME_IN_STATUS.1",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.1.1", "TIME_IN_STATUS.1.2", "TIME_IN_STATUS.1.3"}),
    #"Split Column by Delimiter2" = Table.SplitColumn(#"Split Column by Delimiter1","TIME_IN_STATUS.2",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.2.1", "TIME_IN_STATUS.2.2", "TIME_IN_STATUS.2.3"}),
    #"Split Column by Delimiter3" = Table.SplitColumn(#"Split Column by Delimiter2","TIME_IN_STATUS.3",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.3.1", "TIME_IN_STATUS.3.2", "TIME_IN_STATUS.3.3"}),
    #"Split Column by Delimiter4" = Table.SplitColumn(#"Split Column by Delimiter3","TIME_IN_STATUS.4",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.4.1", "TIME_IN_STATUS.4.2", "TIME_IN_STATUS.4.3"}),
    #"Split Column by Delimiter5" = Table.SplitColumn(#"Split Column by Delimiter4","TIME_IN_STATUS.5",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.5.1", "TIME_IN_STATUS.5.2", "TIME_IN_STATUS.5.3"}),
    #"Split Column by Delimiter6" = Table.SplitColumn(#"Split Column by Delimiter5","TIME_IN_STATUS.6",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.6.1", "TIME_IN_STATUS.6.2", "TIME_IN_STATUS.6.3"}),
    #"Split Column by Delimiter7" = Table.SplitColumn(#"Split Column by Delimiter6","TIME_IN_STATUS.7",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.7.1", "TIME_IN_STATUS.7.2", "TIME_IN_STATUS.7.3"}),
    #"Split Column by Delimiter8" = Table.SplitColumn(#"Split Column by Delimiter7","TIME_IN_STATUS.8",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.8.1", "TIME_IN_STATUS.8.2", "TIME_IN_STATUS.8.3"}),
    #"Split Column by Delimiter9" = Table.SplitColumn(#"Split Column by Delimiter8","TIME_IN_STATUS.9",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.9.1", "TIME_IN_STATUS.9.2", "TIME_IN_STATUS.9.3"}),
    #"Split Column by Delimiter10" = Table.SplitColumn(#"Split Column by Delimiter9","TIME_IN_STATUS.10",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.10.1", "TIME_IN_STATUS.10.2", "TIME_IN_STATUS.10.3"}),
    #"Split Column by Delimiter11" = Table.SplitColumn(#"Split Column by Delimiter10","TIME_IN_STATUS.11",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.11.1", "TIME_IN_STATUS.11.2", "TIME_IN_STATUS.11.3"}),
    #"Split Column by Delimiter12" = Table.SplitColumn(#"Split Column by Delimiter11","TIME_IN_STATUS.12",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.12.1", "TIME_IN_STATUS.12.2", "TIME_IN_STATUS.12.3"}),
    #"Split Column by Delimiter13" = Table.SplitColumn(#"Split Column by Delimiter12","TIME_IN_STATUS.13",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.13.1", "TIME_IN_STATUS.13.2", "TIME_IN_STATUS.13.3"}),
    #"Split Column by Delimiter14" = Table.SplitColumn(#"Split Column by Delimiter13","TIME_IN_STATUS.14",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.14.1", "TIME_IN_STATUS.14.2", "TIME_IN_STATUS.14.3"}),
    #"Split Column by Delimiter15" = Table.SplitColumn(#"Split Column by Delimiter14","TIME_IN_STATUS.15",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.15.1", "TIME_IN_STATUS.15.2", "TIME_IN_STATUS.15.3"}),
    #"Split Column by Delimiter16" = Table.SplitColumn(#"Split Column by Delimiter15","TIME_IN_STATUS.16",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.16.1", "TIME_IN_STATUS.16.2", "TIME_IN_STATUS.16.3"}),
    #"Split Column by Delimiter17" = Table.SplitColumn(#"Split Column by Delimiter16","TIME_IN_STATUS.17",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.17.1", "TIME_IN_STATUS.17.2", "TIME_IN_STATUS.17.3"}),
    #"Split Column by Delimiter18" = Table.SplitColumn(#"Split Column by Delimiter17","TIME_IN_STATUS.18",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.18.1", "TIME_IN_STATUS.18.2", "TIME_IN_STATUS.18.3"}),
    #"Split Column by Delimiter19" = Table.SplitColumn(#"Split Column by Delimiter18","TIME_IN_STATUS.19",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.19.1", "TIME_IN_STATUS.19.2", "TIME_IN_STATUS.19.3"}),
    #"Split Column by Delimiter20" = Table.SplitColumn(#"Split Column by Delimiter19","TIME_IN_STATUS.20",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.20.1", "TIME_IN_STATUS.20.2", "TIME_IN_STATUS.20.3"}),
    #"Added Custom1" = Table.AddColumn(#"Split Column by Delimiter20", "Open", each null),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Reopened", each null),
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "In Progress", each null),
    #"Added Custom4" = Table.AddColumn(#"Added Custom3", "Pending Prioritization", each null),
    #"Added Custom5" = Table.AddColumn(#"Added Custom4", "In Development", each null),
    #"Added Custom6" = Table.AddColumn(#"Added Custom5", "In Development Test", each null),
    #"Added Custom7" = Table.AddColumn(#"Added Custom6", "Emergency", each null),

etc (I have to shoorten the size)

#"Changed Type" = Table.TransformColumnTypes(#"Added Custom22",{{"TIME_IN_STATUS.1.1", Int64.Type}, {"TIME_IN_STATUS.1.2", Int64.Type}, {"TIME_IN_STATUS.1.3", Int64.Type}, {"TIME_IN_STATUS.2.1", Int64.Type}, {"TIME_IN_STATUS.2.2", Int64.Type}, {"TIME_IN_STATUS.2.3", Int64.Type}, {"TIME_IN_STATUS.3.1", Int64.Type}, {"TIME_IN_STATUS.3.2", Int64.Type}, {"TIME_IN_STATUS.3.3", Int64.Type}, {"TIME_IN_STATUS.4.1", Int64.Type}, {"TIME_IN_STATUS.4.2", Int64.Type}, {"TIME_IN_STATUS.4.3", Int64.Type}, {"TIME_IN_STATUS.5.1", Int64.Type}, {"TIME_IN_STATUS.5.2", Int64.Type}, {"TIME_IN_STATUS.5.3", Int64.Type}, {"TIME_IN_STATUS.6.1", Int64.Type}, {"TIME_IN_STATUS.6.2", Int64.Type}, {"TIME_IN_STATUS.6.3", Int64.Type}, {"TIME_IN_STATUS.7.1", Int64.Type}, {"TIME_IN_STATUS.7.2", Int64.Type}, {"TIME_IN_STATUS.7.3", Int64.Type}, {"TIME_IN_STATUS.8.1", Int64.Type}, {"TIME_IN_STATUS.8.2", Int64.Type}, {"TIME_IN_STATUS.8.3", Int64.Type}, {"TIME_IN_STATUS.9.1", Int64.Type}, {"TIME_IN_STATUS.9.2", Int64.Type}, {"TIME_IN_STATUS.9.3", Int64.Type}, {"TIME_IN_STATUS.10.1", Int64.Type}, {"TIME_IN_STATUS.10.2", Int64.Type}, {"TIME_IN_STATUS.10.3", Int64.Type}, {"TIME_IN_STATUS.11.1", Int64.Type}, {"TIME_IN_STATUS.11.2", Int64.Type}, {"TIME_IN_STATUS.11.3", Int64.Type}, {"TIME_IN_STATUS.12.1", Int64.Type}, {"TIME_IN_STATUS.12.2", Int64.Type}, {"TIME_IN_STATUS.12.3", Int64.Type}, {"TIME_IN_STATUS.13.1", Int64.Type}, {"TIME_IN_STATUS.13.2", Int64.Type}, {"TIME_IN_STATUS.13.3", Int64.Type}, {"TIME_IN_STATUS.14.1", Int64.Type}, {"TIME_IN_STATUS.14.2", Int64.Type}, {"TIME_IN_STATUS.14.3", Int64.Type}, {"TIME_IN_STATUS.15.1", Int64.Type}, {"TIME_IN_STATUS.15.2", Int64.Type}, {"TIME_IN_STATUS.15.3", Int64.Type}, {"TIME_IN_STATUS.16.1", Int64.Type}, {"TIME_IN_STATUS.16.2", Int64.Type}, {"TIME_IN_STATUS.16.3", Int64.Type}, {"TIME_IN_STATUS.17.1", Int64.Type}, {"TIME_IN_STATUS.17.2", Int64.Type}, {"TIME_IN_STATUS.17.3", Int64.Type}, {"TIME_IN_STATUS.18.1", Int64.Type}, {"TIME_IN_STATUS.18.2", Int64.Type}, {"TIME_IN_STATUS.18.3", Int64.Type}, {"TIME_IN_STATUS.19.1", Int64.Type}, {"TIME_IN_STATUS.19.2", Int64.Type}, {"TIME_IN_STATUS.19.3", Int64.Type}, {"TIME_IN_STATUS.20.1", Int64.Type}, {"TIME_IN_STATUS.20.2", Int64.Type}, {"TIME_IN_STATUS.20.3", Int64.Type}, {"DATEVALUE", Int64.Type}, {"VALUETYPE", Int64.Type}, {"Open", Int64.Type}, {"Reopened", Int64.Type}, {"In Progress", Int64.Type}, {"Pending Prioritization", Int64.Type}, {"In Development", Int64.Type}, {"In Development Test", Int64.Type}, {"Emergency", Int64.Type}, {"Emergency Resolved", Int64.Type}, {"In Emergency Test", Int64.Type}, {"In Release Test", Int64.Type}, {"Release Test Completed", Int64.Type}, {"Ready For Deployment", Int64.Type}, {"Not Prioritized", Int64.Type}, {"Customer Action Needed (In Progress)", Int64.Type}, {"Customer Action Needed (Pending Prioritization)", Int64.Type}, {"Customer Action Needed (In development)", Int64.Type}, {"Customer Action Needed (Ready for deployment)", Int64.Type}, {"Customer Action Needed (Resolved)", Int64.Type}, {"Customer Action Needed", Int64.Type}, {"Resolved", Int64.Type}, {"Closed", Int64.Type}, {"Junk", Int64.Type}}), #"Fill 1" = Table.FromRecords(Table.TransformRows(#"Changed Type", (r) => Record.TransformFields(r, {Choose(r[TIME_IN_STATUS.1.1]), each if r[TIME_IN_STATUS.1.1] <> null then r[TIME_IN_STATUS.1.3] else _}))), #"Fill 2" = Table.FromRecords(Table.TransformRows(#"Fill 1", (r) => Record.TransformFields(r, {Choose(r[TIME_IN_STATUS.2.1]), each if r[TIME_IN_STATUS.2.1] <> null then r[TIME_IN_STATUS.2.3] else _}))), #"Fill 3" = Table.FromRecords(Table.TransformRows(#"Fill 2", (r) => Record.TransformFields(r, {Choose(r[TIME_IN_STATUS.3.1]), each if r[TIME_IN_STATUS.3.1] <> null then r[TIME_IN_STATUS.3.3] else _}))),
etc. (I have to shoorten the size)

in #"Changed Type1"

 

Here is what's going on.

I have colunm which a rename to "TIME IN STATUS". this column contains a sting which holds the status number, the number of times the ticket have been in the status and the accumulated time in milliseconds.

I split the string into 20 column with the delimiter "|" because the maximum occurences is 18 so far.

Then I split each of them into 3 columns.

I create a number of columns to hold my data.

Then I use a user function to determine the different statuses in the first column and move the accumulated  time to the proper column.

In that user function I call another user function which will find the right column to copy to based on the status number.

I do that 20 times.

 

There is no order in which status is in which original column so I guess it needs to be complicated.

Super User
Super User

Re: Split text string and reorder

As I mentioned  previously, 18 (or 54) columns is not the best shape for your data. This is the performance issue - it has nothing to do with the Power Query code you posted. I suggest you unpivot your data (transform menu)

 

eg you want something like

 

date, status, time

1/1,  status 1,  5

1/1,   Status 2, 4

 

 

 

if you post a list of your column names and some sort of description (in case it is not obvious), I will help more



* Matt is a Microsoft MVP (Power BI) and author of the Power BI Book Supercharge Power BI.

Helpful resources

Announcements
New Topics Started Badges Coming

New Topics Started Badges Coming

We're releasing new versions of the badge that everyone's talking about. ;) Check your inbox for notifications.

MBAS 2020

Save the new date (and location)!

Our business applications community is growing—so we needed a different venue, resulting in a new date and location. See you there!

Difinity Conference

Difinity Conference

The largest Power BI, Power Platform, and Data conference in New Zealand

Top Solution Authors
Top Kudoed Authors (Last 30 Days)