Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

I am stuck trying to figure out how to parse information

Hi there,

I am hitting a wall trying to figure out how to parse the information shown in the image below.

HoudiniUsage.JPG

This is from a log from a license server that plots how many licenses per hour are being pulled.

Col A is the timestamp for usage, B is just the measure of minutes, C is the number of users, D is the Software and E is the user information I need to parse.

What I need to come up with is a column for the user (red underline) and a column for the machine (blue underline) as users are not always on the same machine so we need to plot user and machine.  So my report will be based around the software, with the date as the filter, the user will be mapped to a craft group and the machine will be represented somewhere else.  I am just not sure for a single cell - E -, how to parse out the name before the @then the machine name after the @ but before the next , in order to then be able to use them as individual values for the date and product?

 

Sigh, sorry it's confusing just typing out what I need.

 

S.

 

25 REPLIES 25
PhilipTreacy
Super User
Super User

Hi @Anonymous 

 

But where are you seeing Some are number general, custom - d/mm/yyyy hh:mm or mm/dd/yyyy hh:mm  ?

 

That's not in a CSV, it's in Excel?

 

How did you get the data from the CSV into Excel?



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


PhilipTreacy
Super User
Super User

Hi @Anonymous 

 

Your comment 

 

Some are number general, custom - d/mm/yyyy hh:mm or mm/dd/yyyy hh:mm some are 12 hour, some are 24 hour, it's a bit of a mess really.

Tells me that you are looking at the dates once they are already in Excel.  This is not the way to do it.  You should be using Power Query to import the logs and do the necessary transformations before it gets into the worksheet. 

 

Not copying/pasting logs into Excel then using PQ to transform the data.

 

If you can just provide some sample logs that would be a big help.

 

Phil



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


Anonymous
Not applicable

Incorrect.  The log files come out as a csv.  I copied data from the csv to the sample csv I created for this exercise.

 

 

PhilipTreacy
Super User
Super User

Hi @Anonymous 

 

The code worked with the limited data you supplied (which did not include US dates) and the subsequent issues you had with US dates being converted incorrectly is fixed using Change Type -> Using Locale.

 

You now say that there are dates in more formats.  If you are unable to supply enough sample data that is representative of the problem then any solution you are given will be lacking.

 

If you can actually provide us with enough sample data we might be able to come up with a soution for you.

 

Regards

 

Phil



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


Anonymous
Not applicable

Thanks, obviously I thought I did from the subset of data from what I could see.. that data was copied from the CSV.

 

It wasn't until early hours this morning when this kept erroring out I scroll through thousands of lines of data and manually checked the format to discover when even looked like a date wasn't even a date format.

 

Also why I just put the summary of my finding on there.  What seemed like something actually wasn't as nothing worked and I didn't think to go through 180k of data to find a weird date format from an automated output.  The other 11 files I will just assume have the same issue.

 

 

ziying35
Impactful Individual
Impactful Individual

Hi, @Anonymous 

Try this:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtE3NNI3MjAyUDAysDIwUNJRMgMTQOycX5QKpBLzsjNLih1SEvNK8vMz81J1gKwCh9KCnMy0EqVYHRQjDCk3wohyI4zJNMIAaIYxxAwDUGBQwRhD6hhjRB1jjMk0JhYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t, Column5 = _t]),
    fx = (str)=> List.Transform(List.Split(Text.SplitAny(str,"@,"), 2), each Record.FromList(_, {"User","Machine"})),
    trans = Table.TransformColumns(Source, {"Column5", fx}),
    expd1 = Table.ExpandListColumn(trans, "Column5"),
    expd2 = Table.ExpandRecordColumn(expd1, "Column5", {"User", "Machine"}),
    chtype = Table.TransformColumnTypes(expd2,{{"Column1", type datetime}})
in
    chtype
Anonymous
Not applicable

@PhilipTreacy  the solution didn't work at the time I was trying it.

 

It seems with the column 1 ("Timestamp") when looking through about 180k lines of data there are about 5 different formats for the date field.

Some are number general, custom - d/mm/yyyy hh:mm or mm/dd/yyyy hh:mm some are 12 hour, some are 24 hour, it's a bit of a mess really.

I can't get any of the solutions to not error out on the Timestamp column so not sure what else is wrong there.

I will have to attempt to bounce this back to the vendor and see if they can figure out what their logging script is doing when it comes to the date and time.

 

Thanks for all the assistance.

PhilipTreacy
Super User
Super User

Hi Steve,

 

@Anonymous 

 

The solution I gave you on our forum worked fine.

 

 

 

 

let
    Source = Csv.Document(File.Contents("D:\temp\DudDate.csv"),[Delimiter=",", Columns=5, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Changed Type with Locale" = Table.TransformColumnTypes(Source, {{"Column1", type datetime}}, "en-US"),
    #"Renamed Columns1" = Table.RenameColumns(#"Changed Type with Locale",{{"Column2", "Minutes"}, {"Column3", "Users"}, {"Column4", "Software"}, {"Column5", "User Info"}, {"Column1", "Timestamp"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns1",{ {"Minutes", Int64.Type}, {"Users", Int64.Type}, {"Software", type text}, {"User Info", type text}}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "User Info", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"User Info.1", "User Info.2", "User Info.3", "User Info.4", "User Info.5", "User Info.6"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"User Info.1", type text}, {"User Info.2", type text}, {"User Info.3", type text}, {"User Info.4", type text}, {"User Info.5", type text}, {"User Info.6", type text}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"Minutes", "Users", "Software"}),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Removed Columns", {"Timestamp"}, "Attribute", "Value"),
    #"Removed Columns1" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Attribute"}),
    #"Merged Queries" = Table.NestedJoin(#"Removed Columns1", {"Timestamp"}, #"Changed Type", {"Timestamp"}, "Table1 (2)", JoinKind.LeftOuter),
    #"Expanded Table1 (2)" = Table.ExpandTableColumn(#"Merged Queries", "Table1 (2)", {"Minutes", "Users", "Software"}, {"Table1 (2).Minutes", "Table1 (2).Users", "Table1 (2).Software"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Expanded Table1 (2)",{"Timestamp", "Table1 (2).Minutes", "Table1 (2).Users", "Table1 (2).Software", "Value"}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Reordered Columns", "Value", Splitter.SplitTextByDelimiter("@", QuoteStyle.Csv), {"Value.1", "Value.2"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Value.1", type text}, {"Value.2", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type2",{{"Value.1", "User"}, {"Value.2", "Machine"}, {"Table1 (2).Software", "Software"}, {"Table1 (2).Users", "Number of Users"}, {"Table1 (2).Minutes", "Minutes"}})
in
    #"Renamed Columns"

 

 

 

 

 

 

The issue with the dates is because they are in US format and you need to transform them using Change Type -> Using Locale.  The above query loads the sample CSV you supplied.

 

You'll need to alter the Source to pick up whatever source you are actually using.

 

If you have a table in a workbook called ServerLogs then change this

 

 

Source = Csv.Document(File.Contents("D:\temp\DudDate.csv"),[Delimiter=",", Columns=5, Encoding=1252, QuoteStyle=QuoteStyle.None]),

 

 

to this

 

 

Source = Excel.CurrentWorkbook(){[Name="ServerLogs"]}[Content],

 

 

 

Regards

 

Phil



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


shaowu459
Resolver II
Resolver II

Hi @Anonymous 

 

Please see if this helps, "Information" is the table name of your source data.

 

let
    Source = Excel.CurrentWorkbook(){[Name="Information"]}[Content],
    res = #table(List.RemoveLastN(Table.ColumnNames(Source))&{"User","Machine"},List.TransformMany(Table.ToRows(Source),each Text.Split(List.Last(_),","),(x,y)=>List.RemoveLastN(x)&Text.Split(y,"@")))
in
    res

 

1.png

 

Anonymous
Not applicable

That looks amazing!  

 

I'll have a crack at that soon once I finish up some meetings.  Thanks heaps for putting that together.  I'll come back to the thread when I attempt it.

 

Cheers

 

Anonymous
Not applicable

ok that almost works.  I come unstuck when there are null values for the user column.

 

I have cleared them out with a transform step, but then in the Advanced Editor (in Excel) I am not sure how to use your part of the script in amongst the transform steps that are there already?

try change "Source" in my code to the previous step name.

res = #table(List.RemoveLastN(Table.ColumnNames(The step Name of your transform step))&{"User","Machine"},List.TransformMany(Table.ToRows(The step Name of your transform step),each Text.Split(List.Last(_),","),(x,y)=>List.RemoveLastN(x)&Text.Split(y,"@")))
Anonymous
Not applicable

That worked a treat.. BUT.. and there is always a but.  It seems the excel Timestamp column changes randomly from general number to custom - this is the doing of the server spitting out the log.

I can't find a way to change it to a date and time format when there are two different formats in the one column?!

For example:

1.png

Anonymous
Not applicable

 

 

let
    Source = Excel.CurrentWorkbook(){[Name="Information"]}[Content],
    Custom1 = Table.TransformColumns(Source,{"Timestamp",each DateTime.From(_)}),
    #"Changed Type" = Table.TransformColumnTypes(Custom1,{{"Timestamp", type text}, {"Duration", Int64.Type}, {"Peak", Int64.Type}, {"Product", type text}, {"Users", type text}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Users] <> null and [Users] <> ""),
    res = #table(List.RemoveLastN(Table.ColumnNames(#"Filtered Rows"))&{"User","Machine"},List.TransformMany(Table.ToRows(#"Filtered Rows"),each Text.Split(List.Last(_),","),(x,y)=>List.RemoveLastN(x)&Text.Split(y,"@")))
in
    res

I have tried this but still making a meal out of it.  I have tried to remove the Changed Type also, just keeps erroring out. 

What is the error message says? I used your code above and get no errors. 

1.png 

Anonymous
Not applicable

DataFormat.Error: We couldn't parse the input provided as a DateTime value.
Details:
    03/18/2020 17:00:00

 

Did you format the date in the same rubbish way I have it here.

 

One Date value is Number - Custom - d/mm/yyyy h:mm

the other is Number - General

 

The problem is also, the ones that are randomly displaying as Number - Custom are incorrect.

5/12/2020 22:00
5/12/2020 23:00
05/13/2020 00:00:00
05/13/2020 01:00:00

 

The top two in the spreadsheet are d/mm/yyyy h:mm and that is wrong as the date being displayed there is May 12th but the format would have it as Dec 5th..  It's all over the shop.

Could you provide the Excel file? Save the file to some cloud drive and share the link here.

Anonymous
Not applicable

Try this https://drive.google.com/file/d/1I5XCBEarFIsinW5PAggWlyIHWX0hcI8E/view?usp=sharing

 

The lines should be 12th and 13th of April and the relevant hour.

 

You'll see the mixed bag of formating.

Copy and paste below code to advance editor:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtE3NNI3MjAyUDAysDIwUNJRMgMTQOycX5QKpBLzsjNLih1SEvNK8vMz81J1gKwCh9KCnMy0EqVYHRQjDCk3wohyI4zJNMIAaIYxxAwDUGBQwRhD6hhjRB1jjMk0JhYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t, Column5 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type datetime}}),
    res = #table(List.RemoveLastN(Table.ColumnNames(#"Changed Type"))&{"User","Machine"},List.TransformMany(Table.ToRows(#"Changed Type"),each Text.Split(List.Last(_),","),(x,y)=>List.RemoveLastN(x)&Text.Split(y,"@")))
in
    res

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors