Reply
Frequent Visitor
Posts: 4
Registered: ‎03-16-2017
Accepted Solution

Extracting the duration from different rows

[ Edited ]

Hi,

 

I'm trying to analyze access control data. I believe it's not complex, but I'm struggling until I get more familiar with Power BI. The data is in this format:

PowerBI_Example.jpg

 

What I want to extract is the duration of the session for each user (plus some other information I've got in extra columns) having into consideration that are more records for the different users than the ones showed in the example.

 

Any help will be really appreciated.


Antiparras

 

 

 


Accepted Solutions
New Contributor
Posts: 505
Registered: ‎11-25-2016

Re: Extracting the duration from different rows

[ Edited ]

Still working fine with me.

 

Maybe you can verify - Edit: in the Query Editor at step #"Added Custom4" - with the code below (adjust for your source) in which I created separate columns for each condition that must all be TRUE for the duration to be calculated. If at least one of the conditions is FALSE, Duration will be null.

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Time", type time}, {"Action", type text}, {"Service", type text}, {"User", type text}, {"Server", type text}}),
    #"Inserted Merged Date and Time" = Table.AddColumn(#"Changed Type", "DateTime", each [Date] & [Time], type datetime),
    #"Added Index" = Table.AddIndexColumn(#"Inserted Merged Date and Time", "Original Sort", 0, 1),
    #"Sorted Rows" = Table.Sort(#"Added Index",{{"User", Order.Ascending}, {"Service", Order.Ascending}, {"DateTime", Order.Ascending}}),
    #"Added Index1" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1),
    #"Added Index2" = Table.AddIndexColumn(#"Added Index1", "Index.1", 1, 1),
    #"Merged Queries" = Table.NestedJoin(#"Added Index2",{"Index"},#"Added Index2",{"Index.1"}, "Previous",JoinKind.LeftOuter),
    #"Expanded Previous" = Table.ExpandTableColumn(#"Merged Queries", "Previous", {"Action", "User", "Service", "DateTime"}, {"Previous.Action", "Previous.User", "Previous.Service", "Previous.DateTime"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Previous", "CheckUser", each [User] = [Previous.User]),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "CheckService", each [Service] = [Previous.Service]),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "CheckOut", each [Action] = "OUT"),
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "CheckIn", each [Previous.Action] = "IN"),
    #"Added Custom4" = Table.AddColumn(#"Added Custom3", "Duration", each if [CheckUser] and [CheckService] and [CheckOut] and [CheckIn] then [DateTime] - [Previous.DateTime] else null),
    #"Sorted Rows1" = Table.Sort(#"Added Custom4",{{"Original Sort", Order.Ascending}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Sorted Rows1",{"DateTime", "Action", "Duration", "User", "Service", "Server"})
in
    #"Removed Other Columns"

 

Specializing in Power Query Formula Language (M)

View solution in original post


All Replies
Highlighted
New Contributor
Posts: 505
Registered: ‎11-25-2016

Re: Extracting the duration from different rows

[ Edited ]

Steps taken in the code below:

1. Import Excel table and change type (code adjusted for date and time).

2. Merge Date & Time

3. Add index column as original sort so records can be sorted back at the end.

3. Sort on User and DateTime.

4. Add 2 Indices (starting at 0 and at 1), so the table can be merged with itself (current row with previous row).

5. Merge and expand with required fields.

6. Add duration (if same user and current action = OUT and previous action = IN.

7. Sort back to original sort.

8. Remove all added columns except Duration.

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Time", type time}, {"Action", type text}, {"User", type text}}),
    #"Inserted Merged Date and Time" = Table.AddColumn(#"Changed Type", "DateTime", each [Date] & [Time], type datetime),
    #"Added Index" = Table.AddIndexColumn(#"Inserted Merged Date and Time", "Original Sort", 0, 1),
    #"Sorted Rows" = Table.Sort(#"Added Index",{{"User", Order.Ascending}, {"DateTime", Order.Ascending}}),
    #"Added Index1" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1),
    #"Added Index2" = Table.AddIndexColumn(#"Added Index1", "Index.1", 1, 1),
    #"Merged Queries" = Table.NestedJoin(#"Added Index2",{"Index"},#"Added Index2",{"Index.1"},"Previous",JoinKind.LeftOuter),
    #"Expanded Previous" = Table.ExpandTableColumn(#"Merged Queries", "Previous", {"Action", "User", "DateTime"}, {"Previous.Action", "Previous.User", "Previous.DateTime"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Previous", "Duration", each if [User] = [Previous.User] and [Action] = "OUT" and [Previous.Action] = "IN" then [DateTime] - [Previous.DateTime] else null),
    #"Sorted Rows1" = Table.Sort(#"Added Custom",{{"Original Sort", Order.Ascending}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Sorted Rows1",{"Date", "Time", "Action", "User", "Duration"})
in
    #"Removed Other Columns"

 

Specializing in Power Query Formula Language (M)
Frequent Visitor
Posts: 4
Registered: ‎03-16-2017

Re: Extracting the duration from different rows

Thank you for your response Marcel,

 

I've used your code and I realized I need to use an additional column, 'Service', as the same user can be login into different services at the same time. I came with the following code:

 

let
    Source = Excel.Workbook(File.Contents("D:\Access\access_control.xlsx"), null, true),
    Sheet2_Sheet = Source{[Item="Sheet2",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Sheet2_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Date", type date}, {"Time", type time}, {"Action", type text}, {"Service", type text}, {"User", type text}, {"Server", type text}}),
    #"Inserted Merged Date and Time" = Table.AddColumn(#"Changed Type", "DateTime", each [Date] & [Time], type datetime),
    #"Added Index" = Table.AddIndexColumn(#"Inserted Merged Date and Time", "Original Sort", 0, 1),
    #"Sorted Rows" = Table.Sort(#"Added Index",{{"User", Order.Ascending}, {"Service", Order.Ascending}, {"DateTime", Order.Ascending}}),
    #"Added Index1" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1),
    #"Added Index2" = Table.AddIndexColumn(#"Added Index1", "Index.1", 1, 1),
    #"Merged Queries" = Table.NestedJoin(#"Added Index2",{"Index"},#"Added Index2",{"Index.1"}, "Previous",JoinKind.LeftOuter),
    #"Expanded Previous" = Table.ExpandTableColumn(#"Merged Queries", "Previous", {"Action", "User", "Service", "DateTime"}, {"Previous.Action", "Previous.User", "Previous.Service", "Previous.DateTime"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Previous", "Duration", each if [User] = [Previous.User] and [Service] = [Previous.Service] and [Action] = "OUT" and [Previous.Action] = "IN" then [DateTime] - [Previous.DateTime] else null),
    #"Sorted Rows1" = Table.Sort(#"Added Custom",{{"Original Sort", Order.Ascending}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Sorted Rows1",{"DateTime", "Action", "Duration", "User", "Service", "Server"})
in
    #"Removed Other Columns"

But still I get null in 'Duration'. Could you give me a hand with this?

 

Thanks.

New Contributor
Posts: 505
Registered: ‎11-25-2016

Re: Extracting the duration from different rows

Your code works fine with me.

Possibly your data have no matching combinations?

 

I aded an IN and an OUT for user A and Service B.

 

Extract duration from different rows.png

 

 

Specializing in Power Query Formula Language (M)
Frequent Visitor
Posts: 4
Registered: ‎03-16-2017

Re: Extracting the duration from different rows

Funny...

 

First, thanks for your help. It still doesn't work for me, even when I simplify the source to:

 

DateTimeActionUserServiceServer
01/03/201610:50:34INAA001SERV1
01/03/201610:50:35INAA002SERV1
01/03/201610:50:36INAA003SERV2
01/03/201610:50:37INBA003SERV2
01/03/201610:50:38INAA004SERV1
01/03/201610:50:39INCA002SERV1
01/03/201612:34:22OUTAA001SERV1
01/03/201612:34:24OUTAA002SERV1
01/03/201612:34:29OUTAA003SERV2
20/03/201610:50:34OUTCA002SERV1
20/03/201610:50:40OUTBA003SERV2
20/03/201610:50:50OUTAA004SERV1

 

I still receive null in Duration. I'm thinking something with Excel format or version (Excel 2010)???

 

Thanks.

New Contributor
Posts: 505
Registered: ‎11-25-2016

Re: Extracting the duration from different rows

[ Edited ]

Still working fine with me.

 

Maybe you can verify - Edit: in the Query Editor at step #"Added Custom4" - with the code below (adjust for your source) in which I created separate columns for each condition that must all be TRUE for the duration to be calculated. If at least one of the conditions is FALSE, Duration will be null.

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Time", type time}, {"Action", type text}, {"Service", type text}, {"User", type text}, {"Server", type text}}),
    #"Inserted Merged Date and Time" = Table.AddColumn(#"Changed Type", "DateTime", each [Date] & [Time], type datetime),
    #"Added Index" = Table.AddIndexColumn(#"Inserted Merged Date and Time", "Original Sort", 0, 1),
    #"Sorted Rows" = Table.Sort(#"Added Index",{{"User", Order.Ascending}, {"Service", Order.Ascending}, {"DateTime", Order.Ascending}}),
    #"Added Index1" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1),
    #"Added Index2" = Table.AddIndexColumn(#"Added Index1", "Index.1", 1, 1),
    #"Merged Queries" = Table.NestedJoin(#"Added Index2",{"Index"},#"Added Index2",{"Index.1"}, "Previous",JoinKind.LeftOuter),
    #"Expanded Previous" = Table.ExpandTableColumn(#"Merged Queries", "Previous", {"Action", "User", "Service", "DateTime"}, {"Previous.Action", "Previous.User", "Previous.Service", "Previous.DateTime"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Previous", "CheckUser", each [User] = [Previous.User]),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "CheckService", each [Service] = [Previous.Service]),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "CheckOut", each [Action] = "OUT"),
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "CheckIn", each [Previous.Action] = "IN"),
    #"Added Custom4" = Table.AddColumn(#"Added Custom3", "Duration", each if [CheckUser] and [CheckService] and [CheckOut] and [CheckIn] then [DateTime] - [Previous.DateTime] else null),
    #"Sorted Rows1" = Table.Sort(#"Added Custom4",{{"Original Sort", Order.Ascending}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Sorted Rows1",{"DateTime", "Action", "Duration", "User", "Service", "Server"})
in
    #"Removed Other Columns"

 

Specializing in Power Query Formula Language (M)
Frequent Visitor
Posts: 4
Registered: ‎03-16-2017

Re: Extracting the duration from different rows

Thank you again for your help Marcel,

 

That made the trick and I added some code to get rid of spaces, in case this was the issue too.