Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
eacy
Helper II
Helper II

Split text string and reorder

Hi,

 

I have a column with accumulated time for different statuses and the number of times a ticket have been each statuses.

Each status information is separated with a "|" and the status name, number of times in status and accumulated time is separated with ":"

 

TIME_IN_STATUS
Open : 1 : 6 | In Progress : 2 : 10 | Customer Action Needed : 2 : 4 Enhancement : 1 : 5 | Pending : 1 : 10 | Rejected : 1 : 82 Open : 1 : 8 | Customer Action Needed : 2 : 28 | In Progress : 2 : 30

 

I would like to end up with the amount of columns equal to the maximum number of statuses which can occur (around 18, splited by "|")

Each column header should be the status name e.g. "Open" and the content should be the accumulated time.

 

It will be applied to around 2 million rows so it is important that it is pretty fast.

 

It is no problem to split the text string into multible columns but to create columns with all the "Open", "In Progress" etc. even though they are not placed in the same place in the string for each row.

 

Any help is appriciated

3 REPLIES 3

If I understand correctly, the sample data is the width you have shown (i.e. Not 18 columns). I don't see any ID in the data, but presumable there is one there somewhere. 

 

The first thing I would do is get all the statuses in a single column. So first get them in 3 columns and then save the query (don't load the data). I would then create 3 new queries, each query keeping one of the columns. Finally append them all together. Note this can be done in a single query, but is is more advanced. 

 

Now you say you want to have the values as column headings, so you can then Pivot the data. But note is is not the best data format for power bi. It is better to load it as a long table, then use a matrix to pivot the data. 



* Matt is an 8 times Microsoft MVP (Power BI) and author of the Power BI Book Supercharge Power BI.

Hi Matt,

 

Thanks for your reply,

 

So far I have managed to manipulate my data as I want to but it seems like there is room for improvement concerning the performance of my solution. When I use my date in a chart everything is going very slow.

 

Here is my m code, please advise if you believe it can be speeded up.

 

let
    Choose = (input) => let
     values = {
     {1, "Open"},
     {3, "In Progress"},
     {4, "Reopened"},
     {5, "Resolved"},
     {6, "Closed"},
     {10000, "Emergency"},
     {10001, "Emergency Resolved"},
     {10002, "In Development Test"},
     {10003, "Customer Action Needed"},
     {10005, "In Release Test"},
     {10059, "Customer Action Needed (Resolved)"},
     {10060, "Customer Action Needed (In Progress)"},
     {10061, "Customer Action Needed (In development)"},
     {10062, "Pending Prioritization"},
     {10063, "Customer Action Needed (Pending Prioritization)"},
     {10064, "In Development"},
     {10065, "Ready For Deployment"},
     {10067, "Release Test Completed"},
     {10068, "Customer Action Needed (Ready for deployment)"},
     {10069, "In Emergency Test"},
     {10071, "Not Prioritized"},
     {10081, "Ready For Deployment"},
     {input, "Junk"}
     },
     Result = List.First(List.Select(values, each _{0}=input)){1}
   in
     Result,

    Source = MySQL.Database("jira7-db.xxx.corp", "jira", [Query="select * from customfieldvalue where customfield = 10102", ReturnSingleDatabase=true]),
    #"Renamed Columns" = Table.RenameColumns(Source,{{"TEXTVALUE", "TIME_IN_STATUS"}}),
    #"Replaced Value" = Table.ReplaceValue(#"Renamed Columns","_*|*_","|", Text.Replace, {"TIME_IN_STATUS"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","_*:*_",":", Text.Replace, {"TIME_IN_STATUS"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Replaced Value1","TIME_IN_STATUS",Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv),{"TIME_IN_STATUS.1", "TIME_IN_STATUS.2", "TIME_IN_STATUS.3", "TIME_IN_STATUS.4", "TIME_IN_STATUS.5", "TIME_IN_STATUS.6", "TIME_IN_STATUS.7", "TIME_IN_STATUS.8", "TIME_IN_STATUS.9", "TIME_IN_STATUS.10", "TIME_IN_STATUS.11", "TIME_IN_STATUS.12", "TIME_IN_STATUS.13", "TIME_IN_STATUS.14", "TIME_IN_STATUS.15", "TIME_IN_STATUS.16", "TIME_IN_STATUS.17", "TIME_IN_STATUS.18", "TIME_IN_STATUS.19", "TIME_IN_STATUS.20"}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Split Column by Delimiter","TIME_IN_STATUS.1",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.1.1", "TIME_IN_STATUS.1.2", "TIME_IN_STATUS.1.3"}),
    #"Split Column by Delimiter2" = Table.SplitColumn(#"Split Column by Delimiter1","TIME_IN_STATUS.2",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.2.1", "TIME_IN_STATUS.2.2", "TIME_IN_STATUS.2.3"}),
    #"Split Column by Delimiter3" = Table.SplitColumn(#"Split Column by Delimiter2","TIME_IN_STATUS.3",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.3.1", "TIME_IN_STATUS.3.2", "TIME_IN_STATUS.3.3"}),
    #"Split Column by Delimiter4" = Table.SplitColumn(#"Split Column by Delimiter3","TIME_IN_STATUS.4",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.4.1", "TIME_IN_STATUS.4.2", "TIME_IN_STATUS.4.3"}),
    #"Split Column by Delimiter5" = Table.SplitColumn(#"Split Column by Delimiter4","TIME_IN_STATUS.5",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.5.1", "TIME_IN_STATUS.5.2", "TIME_IN_STATUS.5.3"}),
    #"Split Column by Delimiter6" = Table.SplitColumn(#"Split Column by Delimiter5","TIME_IN_STATUS.6",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.6.1", "TIME_IN_STATUS.6.2", "TIME_IN_STATUS.6.3"}),
    #"Split Column by Delimiter7" = Table.SplitColumn(#"Split Column by Delimiter6","TIME_IN_STATUS.7",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.7.1", "TIME_IN_STATUS.7.2", "TIME_IN_STATUS.7.3"}),
    #"Split Column by Delimiter8" = Table.SplitColumn(#"Split Column by Delimiter7","TIME_IN_STATUS.8",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.8.1", "TIME_IN_STATUS.8.2", "TIME_IN_STATUS.8.3"}),
    #"Split Column by Delimiter9" = Table.SplitColumn(#"Split Column by Delimiter8","TIME_IN_STATUS.9",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.9.1", "TIME_IN_STATUS.9.2", "TIME_IN_STATUS.9.3"}),
    #"Split Column by Delimiter10" = Table.SplitColumn(#"Split Column by Delimiter9","TIME_IN_STATUS.10",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.10.1", "TIME_IN_STATUS.10.2", "TIME_IN_STATUS.10.3"}),
    #"Split Column by Delimiter11" = Table.SplitColumn(#"Split Column by Delimiter10","TIME_IN_STATUS.11",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.11.1", "TIME_IN_STATUS.11.2", "TIME_IN_STATUS.11.3"}),
    #"Split Column by Delimiter12" = Table.SplitColumn(#"Split Column by Delimiter11","TIME_IN_STATUS.12",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.12.1", "TIME_IN_STATUS.12.2", "TIME_IN_STATUS.12.3"}),
    #"Split Column by Delimiter13" = Table.SplitColumn(#"Split Column by Delimiter12","TIME_IN_STATUS.13",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.13.1", "TIME_IN_STATUS.13.2", "TIME_IN_STATUS.13.3"}),
    #"Split Column by Delimiter14" = Table.SplitColumn(#"Split Column by Delimiter13","TIME_IN_STATUS.14",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.14.1", "TIME_IN_STATUS.14.2", "TIME_IN_STATUS.14.3"}),
    #"Split Column by Delimiter15" = Table.SplitColumn(#"Split Column by Delimiter14","TIME_IN_STATUS.15",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.15.1", "TIME_IN_STATUS.15.2", "TIME_IN_STATUS.15.3"}),
    #"Split Column by Delimiter16" = Table.SplitColumn(#"Split Column by Delimiter15","TIME_IN_STATUS.16",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.16.1", "TIME_IN_STATUS.16.2", "TIME_IN_STATUS.16.3"}),
    #"Split Column by Delimiter17" = Table.SplitColumn(#"Split Column by Delimiter16","TIME_IN_STATUS.17",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.17.1", "TIME_IN_STATUS.17.2", "TIME_IN_STATUS.17.3"}),
    #"Split Column by Delimiter18" = Table.SplitColumn(#"Split Column by Delimiter17","TIME_IN_STATUS.18",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.18.1", "TIME_IN_STATUS.18.2", "TIME_IN_STATUS.18.3"}),
    #"Split Column by Delimiter19" = Table.SplitColumn(#"Split Column by Delimiter18","TIME_IN_STATUS.19",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.19.1", "TIME_IN_STATUS.19.2", "TIME_IN_STATUS.19.3"}),
    #"Split Column by Delimiter20" = Table.SplitColumn(#"Split Column by Delimiter19","TIME_IN_STATUS.20",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"TIME_IN_STATUS.20.1", "TIME_IN_STATUS.20.2", "TIME_IN_STATUS.20.3"}),
    #"Added Custom1" = Table.AddColumn(#"Split Column by Delimiter20", "Open", each null),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Reopened", each null),
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "In Progress", each null),
    #"Added Custom4" = Table.AddColumn(#"Added Custom3", "Pending Prioritization", each null),
    #"Added Custom5" = Table.AddColumn(#"Added Custom4", "In Development", each null),
    #"Added Custom6" = Table.AddColumn(#"Added Custom5", "In Development Test", each null),
    #"Added Custom7" = Table.AddColumn(#"Added Custom6", "Emergency", each null),

etc (I have to shoorten the size)

#"Changed Type" = Table.TransformColumnTypes(#"Added Custom22",{{"TIME_IN_STATUS.1.1", Int64.Type}, {"TIME_IN_STATUS.1.2", Int64.Type}, {"TIME_IN_STATUS.1.3", Int64.Type}, {"TIME_IN_STATUS.2.1", Int64.Type}, {"TIME_IN_STATUS.2.2", Int64.Type}, {"TIME_IN_STATUS.2.3", Int64.Type}, {"TIME_IN_STATUS.3.1", Int64.Type}, {"TIME_IN_STATUS.3.2", Int64.Type}, {"TIME_IN_STATUS.3.3", Int64.Type}, {"TIME_IN_STATUS.4.1", Int64.Type}, {"TIME_IN_STATUS.4.2", Int64.Type}, {"TIME_IN_STATUS.4.3", Int64.Type}, {"TIME_IN_STATUS.5.1", Int64.Type}, {"TIME_IN_STATUS.5.2", Int64.Type}, {"TIME_IN_STATUS.5.3", Int64.Type}, {"TIME_IN_STATUS.6.1", Int64.Type}, {"TIME_IN_STATUS.6.2", Int64.Type}, {"TIME_IN_STATUS.6.3", Int64.Type}, {"TIME_IN_STATUS.7.1", Int64.Type}, {"TIME_IN_STATUS.7.2", Int64.Type}, {"TIME_IN_STATUS.7.3", Int64.Type}, {"TIME_IN_STATUS.8.1", Int64.Type}, {"TIME_IN_STATUS.8.2", Int64.Type}, {"TIME_IN_STATUS.8.3", Int64.Type}, {"TIME_IN_STATUS.9.1", Int64.Type}, {"TIME_IN_STATUS.9.2", Int64.Type}, {"TIME_IN_STATUS.9.3", Int64.Type}, {"TIME_IN_STATUS.10.1", Int64.Type}, {"TIME_IN_STATUS.10.2", Int64.Type}, {"TIME_IN_STATUS.10.3", Int64.Type}, {"TIME_IN_STATUS.11.1", Int64.Type}, {"TIME_IN_STATUS.11.2", Int64.Type}, {"TIME_IN_STATUS.11.3", Int64.Type}, {"TIME_IN_STATUS.12.1", Int64.Type}, {"TIME_IN_STATUS.12.2", Int64.Type}, {"TIME_IN_STATUS.12.3", Int64.Type}, {"TIME_IN_STATUS.13.1", Int64.Type}, {"TIME_IN_STATUS.13.2", Int64.Type}, {"TIME_IN_STATUS.13.3", Int64.Type}, {"TIME_IN_STATUS.14.1", Int64.Type}, {"TIME_IN_STATUS.14.2", Int64.Type}, {"TIME_IN_STATUS.14.3", Int64.Type}, {"TIME_IN_STATUS.15.1", Int64.Type}, {"TIME_IN_STATUS.15.2", Int64.Type}, {"TIME_IN_STATUS.15.3", Int64.Type}, {"TIME_IN_STATUS.16.1", Int64.Type}, {"TIME_IN_STATUS.16.2", Int64.Type}, {"TIME_IN_STATUS.16.3", Int64.Type}, {"TIME_IN_STATUS.17.1", Int64.Type}, {"TIME_IN_STATUS.17.2", Int64.Type}, {"TIME_IN_STATUS.17.3", Int64.Type}, {"TIME_IN_STATUS.18.1", Int64.Type}, {"TIME_IN_STATUS.18.2", Int64.Type}, {"TIME_IN_STATUS.18.3", Int64.Type}, {"TIME_IN_STATUS.19.1", Int64.Type}, {"TIME_IN_STATUS.19.2", Int64.Type}, {"TIME_IN_STATUS.19.3", Int64.Type}, {"TIME_IN_STATUS.20.1", Int64.Type}, {"TIME_IN_STATUS.20.2", Int64.Type}, {"TIME_IN_STATUS.20.3", Int64.Type}, {"DATEVALUE", Int64.Type}, {"VALUETYPE", Int64.Type}, {"Open", Int64.Type}, {"Reopened", Int64.Type}, {"In Progress", Int64.Type}, {"Pending Prioritization", Int64.Type}, {"In Development", Int64.Type}, {"In Development Test", Int64.Type}, {"Emergency", Int64.Type}, {"Emergency Resolved", Int64.Type}, {"In Emergency Test", Int64.Type}, {"In Release Test", Int64.Type}, {"Release Test Completed", Int64.Type}, {"Ready For Deployment", Int64.Type}, {"Not Prioritized", Int64.Type}, {"Customer Action Needed (In Progress)", Int64.Type}, {"Customer Action Needed (Pending Prioritization)", Int64.Type}, {"Customer Action Needed (In development)", Int64.Type}, {"Customer Action Needed (Ready for deployment)", Int64.Type}, {"Customer Action Needed (Resolved)", Int64.Type}, {"Customer Action Needed", Int64.Type}, {"Resolved", Int64.Type}, {"Closed", Int64.Type}, {"Junk", Int64.Type}}), #"Fill 1" = Table.FromRecords(Table.TransformRows(#"Changed Type", (r) => Record.TransformFields(r, {Choose(r[TIME_IN_STATUS.1.1]), each if r[TIME_IN_STATUS.1.1] <> null then r[TIME_IN_STATUS.1.3] else _}))), #"Fill 2" = Table.FromRecords(Table.TransformRows(#"Fill 1", (r) => Record.TransformFields(r, {Choose(r[TIME_IN_STATUS.2.1]), each if r[TIME_IN_STATUS.2.1] <> null then r[TIME_IN_STATUS.2.3] else _}))), #"Fill 3" = Table.FromRecords(Table.TransformRows(#"Fill 2", (r) => Record.TransformFields(r, {Choose(r[TIME_IN_STATUS.3.1]), each if r[TIME_IN_STATUS.3.1] <> null then r[TIME_IN_STATUS.3.3] else _}))),
etc. (I have to shoorten the size)

in #"Changed Type1"

 

Here is what's going on.

I have colunm which a rename to "TIME IN STATUS". this column contains a sting which holds the status number, the number of times the ticket have been in the status and the accumulated time in milliseconds.

I split the string into 20 column with the delimiter "|" because the maximum occurences is 18 so far.

Then I split each of them into 3 columns.

I create a number of columns to hold my data.

Then I use a user function to determine the different statuses in the first column and move the accumulated  time to the proper column.

In that user function I call another user function which will find the right column to copy to based on the status number.

I do that 20 times.

 

There is no order in which status is in which original column so I guess it needs to be complicated.

As I mentioned  previously, 18 (or 54) columns is not the best shape for your data. This is the performance issue - it has nothing to do with the Power Query code you posted. I suggest you unpivot your data (transform menu)

 

eg you want something like

 

date, status, time

1/1,  status 1,  5

1/1,   Status 2, 4

 

 

 

if you post a list of your column names and some sort of description (in case it is not obvious), I will help more



* Matt is an 8 times Microsoft MVP (Power BI) and author of the Power BI Book Supercharge Power BI.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.