cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
mikejarod Helper III
Helper III

Conditionally combining multiple rows into one column

I have a folder full of files in the format below. I want to load these into PowerBI using Power Query.
See attached picture, I colored every column with its own color.

Loading and splitting the columns that are NOT colored orange are not a problem.

The orange values are an error message. Most of the time these are short string in the same line, but sometimes they contain detailed error logs of mulitple rows. 

How can I get the orange colored values into one column, where the multiple row logs will end up in one "cell" or value ?

 

WARN [Import schedule Elvaco importer] [2019-06-23 08:12:18,620] - (com.energyict.mdw.importimpl.FileImportImpl:process) - MDI202 End Import: severe errors occurred 
ERROR [Import schedule Elvaco importer] [2019-06-23 08:12:22,522] - File import failure
com.energyict.delta.mdr.imp.csvconsumption.exception.DeviceChannelCombinationNotFoundException: DEL-DVAL-20: No device with serial-number '00691138' has a channel with 'CHN001' after the last slash ('/') in the external name.
	at com.energyict.delta.mdr.imp.csvconsumption.CsvConsumptionDataParser.checkIfRtuWithChannelExistsAndSetChannelInModel(CsvConsumptionDataParser.java:209)
	at com.energyict.delta.mdr.imp.csvconsumption.CsvConsumptionDataParser.parseDataLine(CsvConsumptionDataParser.java:133)
	at com.energyict.delta.mdr.imp.csvconsumption.CsvConsumptionDataImporter.importReader(CsvConsumptionDataImporter.java:77)
	at com.energyict.eisimport.core.AbstractReaderImporter.doImportFile(AbstractReaderImporter.java:48)
	at com.energyict.eisimport.core.AbstractImporter.importFile(AbstractImporter.java:93)
	at com.energyict.mdw.importimpl.FileImportImpl.doProcess(FileImportImpl.java:236)
	at com.energyict.mdw.importimpl.FileImportImpl.access$100(FileImportImpl.java:25)
	at com.energyict.mdw.importimpl.FileImportImpl$3.doExecute(FileImportImpl.java:225)
	at com.energyict.cpo.Environment.execute(Environment.java:361)
	at com.energyict.cpo.Environment.execute(Environment.java:339)
	at com.energyict.mdw.importimpl.FileImportImpl.processFile(FileImportImpl.java:232)
	at com.energyict.mdw.importimpl.FileImportImpl.process(FileImportImpl.java:200)
	at com.energyict.mdw.importimpl.ImportMappingImpl.process(ImportMappingImpl.java:300)
	at com.energyict.mdw.importimpl.ImportMappingImpl.execute(ImportMappingImpl.java:248)
	at com.energyict.mdw.importimpl.ImportScheduleImpl.doExecute(ImportScheduleImpl.java:270)
	at com.energyict.mdw.importimpl.ScheduleImpl.execute(ScheduleImpl.java:79)
	at com.energyict.mdw.importimpl.ImportScheduleProcess.doRun(ImportScheduleProcess.java:187)
	at com.energyict.mdw.importimpl.ImportScheduleProcess.run(ImportScheduleProcess.java:151)
	at java.lang.Thread.run(Thread.java:748)
 WARN [Import schedule Elvaco importer] [2019-06-23 08:12:22,569] - (com.energyict.mdw.importimpl.FileImportImpl:process) - MDI201 Import failed: com.energyict.delta.mdr.imp.csvconsumption.exception.DeviceChannelCombinationNotFoundException: DEL-DVAL-20: No device with serial-number '00691138' has a channel with 'CHN001' after the last slash ('/') in the external name. 

 

logfile.png

1 ACCEPTED SOLUTION

Accepted Solutions
mikejarod Helper III
Helper III

Re: Conditionally combining multiple rows into one column

@PhilC thank you for your help, I think I solved it using parts of your script.

 

1. Import file in 1 column

2. Insert new column, with values of original column only if the line starts with (ERROR, WARN, FATAL). So the messages containing multiple lines will not be put into this new column

3. Fill new column down

4. Group by on this new column as the key (it contains a timestamp so thats possible), and the original column as the value, and removing linefeeds in the grouping process:

#"Group" = Table.Group(#"Fill Down", {"Temp"}, {{"Column1", each Text.Combine([Column1],"#(lf)"), type text}})

5. This solves the original problem. Now split by delimiters and set column types to finish

 

@PhilC please explain what your thinking process was by adding the index column and the logic that uses this index column? I didn't need this part and this made it slow. Maybe I'm missing something?

View solution in original post

10 REPLIES 10
v-piga-msft Resident Rockstar
Resident Rockstar

Re: Conditionally combining multiple rows into one column

Hi @mikejarod ,

Do you want to get the output below? 

Untitled.png

If so, we could achieve that in Query Editor. Please refer to the Applied Steps in my attachement.

If you still need help, please share your desired output so that we could help further on it.

Best Regards,

Cherry

 

Community Support Team _ Cherry Gao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
mikejarod Helper III
Helper III

Re: Conditionally combining multiple rows into one column

Hi @v-piga-msft 

Thanks for your reply. Unless I understand your solution incorrectly, this is not the output I'm looking for. I attached a screenshot of an Excel table with the preferred output. The first 4 columns are not a problem for me, column5 is.

wantedoutput.png

PhilC Helper II
Helper II

Re: Conditionally combining multiple rows into one column

HI, in the data you provided initially, has it been edited at all or is it exactly as per the original file?  I am wondering why for the second record the "com...." is on its own line and not following on from "File import failure"

mikejarod Helper III
Helper III

Re: Conditionally combining multiple rows into one column

Hi @PhilC , the data is original (except for the emoticons parsed by this forum). 

What you are referring to is the exact reason I need help 🙂

When a row starts with a value not equal to ERROR, then the last value in the row is the error message.

When a row starts with value ERROR, then the last value is not available, and the error message is in the following lines until a new line appears starting with starting value in ([space]WARN, ERROR, FATAL).

It seems I cannot attach a txt file here so here is a new example, sliglty edited to make it shorter:

FATAL [CPS queue thread 0] [2019-06-25 00:30:52,426] - (com.energyict.nl.cps.mail:checkBouncyCastleProvider) - Can't find BC provider - adding
 WARN [EDINE queue thread 3] [2019-06-25 00:50:11,005] - (nl.sogeti.delta.pluggable.export.export) - Er zijn geen meterstand op te vragen uit het TM
 WARN [EDINE queue thread 3] [2019-06-25 01:50:29,938] - (nl.sogeti.delta.pluggable.export.export) - Er zijn geen meterstand op te vragen uit het TM
ERROR [FTP queue thread 3] [2019-06-25 02:41:49,751] - Connection timed out: connect
java.net.ConnectException: Connection timed out: connect
	at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
	at java.lang.Thread.run(Thread.java:748)
 WARN [FTP queue thread 3] [2019-06-25 02:41:49,767] - (com.energyict.mdw.exportimpl.logFailure) - Export handler error : java.net.ConnectException: Connection timed out
ERROR [FTP queue thread 2] [2019-06-25 02:41:52,184] - java.net.ConnectException: Connection timed out: connect
java.lang.RuntimeException: java.net.ConnectException: Connection timed out: connect
	at com.energyict.mdw.exportimpl.SFtpActionStrategy$SftpClient.connectAndLogIn(SFtpActionStrategy.java:206)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection timed out: connect
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	... 17 more
 WARN [FTP queue thread 2] [2019-06-25 02:41:52,184] - (com.energyict.mdw.exportimpl.logFailure) - Export handler error : Connection timed out: connect

 2019-06-25_08-04-03.png

Also attached a screenshot of this text in Notepad++ including the indicators for Carriage Return, Line Feed, Spaces and Tabs.

 

mikejarod Helper III
Helper III

Re: Conditionally combining multiple rows into one column

I have a functional idea how to do this but I'm not skilled enough in M to write script for this.

"Break up into seperate columns" is not specified because it should not be the problem here.

 

IF SUBSTRING(line, 1,5) in ([space]WARN, FATAL) THEN read entire line until carriage return line break, and then break up in seperate columns 

 

IF SUBSTRING(line, 1,5) IN (ERROR) then remove every carriage return line break for the current line and all following lines until a new line starts with SUBSTRING(line, 1,5) in ([space]WARN, FATAL), and then break up in seperate columns

 

I found options in M to replace/remove carriage raturn / line breaks but this seems to work only after importing, and it should be done before importing / at the importing stage.

PhilC Helper II
Helper II

Re: Conditionally combining multiple rows into one column

The following code works for the original data saved as a text file, hoping it works for the full dataset.  It feels like there should be a more elegant solution, but would require a level of knowledge greater than mine.

 

let
    Source = Csv.Document(File.Contents("N:\DVC-R\8 ED(RO)\Data and Statistics\Subscriptions, Resources and Software\Power BI\Support\Log File Extract.txt"),[Delimiter="|", Columns=1, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Added Custom" = Table.AddColumn(Source, "Row Begin Check", each if Text.Start(Text.Trim([Column1]),1) = Text.Upper(Text.Start(Text.Trim([Column1]),1)) then [Column1] else null, type text),
    #"Added Index" = Table.AddIndexColumn(#"Added Custom", "Index", 1, 1),
    #"Changed Type" = Table.TransformColumnTypes(#"Added Index",{{"Index", Int64.Type}}),
    #"Added Custom1" = Table.AddColumn(#"Changed Type", "Custom", each try

if [Row Begin Check] <> null and #"Changed Type"[Row Begin Check]{[Index]} = null then " - "&#"Changed Type"[Column1]{[Index]} else null

otherwise null),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Custom.1", each try 

if #"Changed Type"[Row Begin Check]{[Index]-2} <> null and [Row Begin Check] = null  then 1 else null

otherwise null),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom2", each ([Custom.1] = null)),
    #"Inserted Merged Column" = Table.AddColumn(#"Filtered Rows", "Merged", each Text.Combine({[Row Begin Check], [Custom]}, ""), type text),
    #"Replaced Value" = Table.ReplaceValue(#"Inserted Merged Column","",null,Replacer.ReplaceValue,{"Merged"}),
    #"Filled Down2" = Table.FillDown(#"Replaced Value",{"Merged"}),
    #"Added Custom3" = Table.AddColumn(#"Filled Down2", "Custom.2", each if [Row Begin Check] = null then [Column1] else [Merged]),
    #"Grouped Rows1" = Table.Group(#"Added Custom3", {"Merged"}, {{"Combined", each Text.Combine([Custom.2],"#(lf)"), type text}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Grouped Rows1",{"Combined"}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Removed Other Columns", "Combined", Splitter.SplitTextByEachDelimiter({"["}, QuoteStyle.Csv, false), {"Column 1", "Column Temp"}),
    #"Split Column by Delimiter2" = Table.SplitColumn(#"Split Column by Delimiter1", "Column Temp", Splitter.SplitTextByEachDelimiter({"] ["}, QuoteStyle.Csv, false), {"Column 2", "Column Temp"}),
    #"Split Column by Delimiter3" = Table.SplitColumn(#"Split Column by Delimiter2", "Column Temp", Splitter.SplitTextByEachDelimiter({"] - "}, QuoteStyle.Csv, false), {"Column 3", "Column Temp"}),
    #"Split Column by Delimiter4" = Table.SplitColumn(#"Split Column by Delimiter3", "Column Temp", Splitter.SplitTextByEachDelimiter({" - "}, QuoteStyle.Csv, false), {"Column 4", "Column 5"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter4",{{"Column 1", type text}, {"Column 2", type text}, {"Column 3", type text}, {"Column 4", type text}, {"Column 5", type text}})
in
    #"Changed Type1"

Cheers,

PhilC

mikejarod Helper III
Helper III

Re: Conditionally combining multiple rows into one column

Thanks! I guess there is a infinite loop problem because with a full production file it just keeps on loading and says the file is gigabytes in size, and eventually my system is out of memory and it crashes 🙂 But I will try to understand your steps and find the problem, it's a good start.

Highlighted
PhilC Helper II
Helper II

Re: Conditionally combining multiple rows into one column

Ah, sorry.  Using the Index offset might be the issue on a larger number of rows.

mikejarod Helper III
Helper III

Re: Conditionally combining multiple rows into one column

@PhilC thank you for your help, I think I solved it using parts of your script.

 

1. Import file in 1 column

2. Insert new column, with values of original column only if the line starts with (ERROR, WARN, FATAL). So the messages containing multiple lines will not be put into this new column

3. Fill new column down

4. Group by on this new column as the key (it contains a timestamp so thats possible), and the original column as the value, and removing linefeeds in the grouping process:

#"Group" = Table.Group(#"Fill Down", {"Temp"}, {{"Column1", each Text.Combine([Column1],"#(lf)"), type text}})

5. This solves the original problem. Now split by delimiters and set column types to finish

 

@PhilC please explain what your thinking process was by adding the index column and the logic that uses this index column? I didn't need this part and this made it slow. Maybe I'm missing something?

View solution in original post

Helpful resources

Announcements
New Ranks Launched March 24th!

New Ranks Launched March 24th!

The time has come: We are finally able to share more details on the brand-new ranks coming to the Power BI Community!

‘Better Together’ Contest Finalists Announced!

‘Better Together’ Contest Finalists Announced!

Congrats to the finalists of our ‘Better Together’-themed T-shirt design contest! Click for the top entries.

Arun 'Triple A' Event Video, Q&A, and Slides

Arun 'Triple A' Event Video, Q&A, and Slides

Missed the Arun 'Triple A' event or want to revisit it? We've got you covered! Check out the video, Q&A, and slides now.

Join THE global Microsoft Power Platform event series.

Join THE global Power Platform event series.

Attend for two days of expert-led learning and innovation on topics like AI and Analytics, powered by Dynamic Communities.

Community Summit North America

Community Summit North America

Innovate, Collaborate, Grow. The top training and networking event across the globe for Microsoft Business Applications

Top Solution Authors
Top Kudoed Authors