Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Davykyndt
Frequent Visitor

Merge records with different time, where the time difference is smaller than 1 hour.

Hi,

I have a bunch of IP camera's, who are monitored by a ip scanning program.

Every 120 seconds all camera's are being pinged, the camera's thar are not active are logged.
I'd like to visualize when people are disabling the camera's when working.

So i have a log that looks like this.

Timestamp       IP          MSG
3/01/2017 4:05 192.168.1.23 Request Timeout
3/01/2017 7:24 192.168.1.23 Request Timeout
3/02/2017 4:05 192.168.1.23 Request Timeout
3/02/2017 4:11 192.168.1.23 Request Timeout
3/03/2017 4:12 192.168.1.23 Request Timeout
3/03/2017 11:50 192.168.1.23 Request Timeout
3/03/2017 11:52 192.168.1.23 Request Timeout
3/03/2017 11:54 192.168.1.23 Request Timeout
3/03/2017 11:56 192.168.1.23 Request Timeout
3/03/2017 11:58 192.168.1.23 Request Timeout
...
Same data every 2 minutes
...
3/03/2017 15:55 192.168.1.23 Request Timeout
3/03/2017 15:57 192.168.1.23 Request Timeout
3/03/2017 15:59 192.168.1.23 Request Timeout
3/03/2017 16:49 192.168.1.23 Request Timeout


3/04/2017 11:50 192.168.1.23 Request Timeout
3/04/2017 11:52 192.168.1.23 Request Timeout
3/04/2017 11:54 192.168.1.23 Request Timeout
3/04/2017 11:56 192.168.1.23 Request Timeout
3/04/2017 11:58 192.168.1.23 Request Timeout
...
Same data every 2 minutes
...
3/04/2017 15:55 192.168.1.23 Request Timeout
3/04/2017 15:57 192.168.1.23 Request Timeout
3/04/2017 15:59 192.168.1.23 Request Timeout
3/04/2017 16:49 192.168.1.23 Request Timeout
28/02/2017 10:07:17 192.168.1.23 Request Timeout
28/02/2017 12:58:00 192.168.1.23 Request Timeout
28/02/2017 4:05:47 192.168.1.23 Request Timeout

I would like to generate a column where a row contains the from and to time.
the timeframe between 2 rows from the same ip addres must be greater then 1 hour.
This means when the camera is unreachable every 2 minutes for more than 4 hours it is 1 event.
When the camera is reachable again, and becomes unreachable withing that hour, it is the same event and the end time is this time of the row.
I do not need single rows in the event table

When the camera is reachable again, en becomes unreachable the next day, it is considered as a new event.

I would like to have an output event table like this.


IP address   | Timestamp start| Timestamp end
192.168.1.23 | 3/01/2017 4:05 | 3/01/2017 4:05 <== I do not want this single event in the table
192.168.1.23 | 3/02/2017 4:05 | 3/02/2017 4:11
192.168.1.23 | 3/03/2017 11:50| 3/03/2017 16:49
192.168.1.23 | 4/03/2017 11:50| 4/03/2017 16:49


Any change on how to do this in PowerBI.

1 ACCEPTED SOLUTION
MarcelBeug
Community Champion
Community Champion

That would become some complex query like the one below:

 

let

    // Adjust the Source to your Source:
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Time", type time}, {"IP", type text}}),
    #"Merged Date and Time" = Table.CombineColumns(#"Changed Type", {"Date", "Time"}, (columns) => List.First(columns) & List.Last(columns), "Timestamp"),
    #"**** Input data complete" = "At this point, the input data is complete: only Timestamp and IP.",

    // Add original sort so the data can be sorted back at the very end of the query:
    AddedOriginalSort = Table.AddIndexColumn(#"Merged Date and Time", "OriginalSort", 1, 1),

    SortOnIPAndTimestamp = Table.Sort(AddedOriginalSort,{{"IP", Order.Ascending}, {"Timestamp", Order.Ascending}}),

    #"**** Merge with itself (previous)" = "to get previous data on same row as current data",
    #"Added Index" = Table.AddIndexColumn(SortOnIPAndTimestamp, "Index", 0, 1),
    #"Added Index1" = Table.AddIndexColumn(#"Added Index", "Index.1", 1, 1),
    #"Merged Queries" = Table.NestedJoin(#"Added Index1",{"Index"},#"Added Index1",{"Index.1"},"Previous",JoinKind.LeftOuter),
    #"Expanded Previous" = Table.ExpandTableColumn(#"Merged Queries", "Previous", {"Timestamp", "IP"}, {"Previous.Timestamp", "Previous.IP"}),
    #"Sorted Rows" = Table.Sort(#"Expanded Previous",{{"Index", Order.Ascending}}),
    
    // Determine start of series:
    #"Added Custom" = Table.AddColumn(#"Sorted Rows", "Start", each if [Previous.Timestamp] = null then[Timestamp] else if [IP] = [Previous.IP] and [Timestamp] - [Previous.Timestamp] <= #duration(0,0,3,0) then null else [Timestamp]),

    #"**** Merge with itself (next)" = "to get next data on same row as current data",
    #"Merged Queries1" = Table.NestedJoin(#"Added Custom",{"Index.1"},#"Added Custom",{"Index"},"Next",JoinKind.LeftOuter),
    #"Expanded Next" = Table.ExpandTableColumn(#"Merged Queries1", "Next", {"Timestamp", "IP"}, {"Next.Timestamp", "Next.IP"}),
    #"Added Custom1" = Table.AddColumn(#"Expanded Next", "End", each if [Next.Timestamp] = null then [Timestamp] else if [IP] = [Next.IP] and [Next.Timestamp] - [Timestamp] <= #duration(0,0,3,0) then null else [Timestamp]),

    // Fill Start down and End uo:
    #"Filled Down" = Table.FillDown(#"Added Custom1",{"Start"}),
    #"Filled Up" = Table.FillUp(#"Filled Down",{"End"}),

    // Remove columns that are no longer required:
    #"Removed Other Columns" = Table.SelectColumns(#"Filled Up",{"IP", "Start", "End", "OriginalSort"}),

    // Remove duplicates:
    #"Removed Duplicates" = Table.Distinct(#"Removed Other Columns", {"IP", "Start"}),

    #"**** Again: merge with itself (previous)" = "to get previous data on same row as current data",
    #"Added Index3" = Table.AddIndexColumn(#"Removed Duplicates", "Index", 0, 1),
    #"Added Index4" = Table.AddIndexColumn(#"Added Index3", "Index.1", 1, 1),
    #"Merged Queries2" = Table.NestedJoin(#"Added Index4",{"Index"},#"Added Index4",{"Index.1"},"Previous",JoinKind.LeftOuter),
    #"Expanded Previous1" = Table.ExpandTableColumn(#"Merged Queries2", "Previous", {"IP", "End"}, {"Previous.IP", "Previous.End"}),
    #"Sorted Rows2" = Table.Sort(#"Expanded Previous1",{{"Index", Order.Ascending}}),

    // Only keep a row if the previous End is not within the last hour on the same date for the same IP address,
    #"Added Custom2" = Table.AddColumn(#"Sorted Rows2", "Keep", each [Previous.IP] = null or [IP] <> [Previous.IP] or DateTime.Date([Start]) <> DateTime.Date([Previous.End]) or [Start] - [Previous.End] >= #duration(0,1,0,0)),

    #"**** Again: merge with itself (next)" = "to get next data on same row as current data",
    #"Merged Queries3" = Table.NestedJoin(#"Added Custom2",{"Index.1"},#"Added Custom2",{"Index"},"Next",JoinKind.LeftOuter),
    #"Expanded Next1" = Table.ExpandTableColumn(#"Merged Queries3", "Next", {"End", "IP", "Keep"}, {"Next.End", "Next.IP", "Next.Keep"}),
    // Remove records in which both Keep and Next.Keep are false:
    #"Filtered Rows2" = Table.SelectRows(#"Expanded Next1", each [Keep] or [Next.Keep]),

    // Take the End from the next record if the next record won't be kept:
    #"Added Custom3" = Table.AddColumn(#"Filtered Rows2", "NewEnd", each if [Next.Keep] = null then [End] else if not [Next.Keep] then [Next.End] else [End]),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom3", each [Keep]),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"End", "Index", "Index.1", "Previous.IP", "Previous.End", "Keep", "Next.End", "Next.IP", "Next.Keep"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"NewEnd", "End"}}),
    #"Filtered Rows1" = Table.SelectRows(#"Renamed Columns", each [Start] <> [End]),
    #"Sorted Rows3" = Table.Sort(#"Filtered Rows1",{{"OriginalSort", Order.Ascending}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Sorted Rows3",{"OriginalSort"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns1",{{"Start", type datetime}, {"End", type datetime}})
in
    #"Changed Type1"
Specializing in Power Query Formula Language (M)

View solution in original post

2 REPLIES 2
MarcelBeug
Community Champion
Community Champion

That would become some complex query like the one below:

 

let

    // Adjust the Source to your Source:
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Time", type time}, {"IP", type text}}),
    #"Merged Date and Time" = Table.CombineColumns(#"Changed Type", {"Date", "Time"}, (columns) => List.First(columns) & List.Last(columns), "Timestamp"),
    #"**** Input data complete" = "At this point, the input data is complete: only Timestamp and IP.",

    // Add original sort so the data can be sorted back at the very end of the query:
    AddedOriginalSort = Table.AddIndexColumn(#"Merged Date and Time", "OriginalSort", 1, 1),

    SortOnIPAndTimestamp = Table.Sort(AddedOriginalSort,{{"IP", Order.Ascending}, {"Timestamp", Order.Ascending}}),

    #"**** Merge with itself (previous)" = "to get previous data on same row as current data",
    #"Added Index" = Table.AddIndexColumn(SortOnIPAndTimestamp, "Index", 0, 1),
    #"Added Index1" = Table.AddIndexColumn(#"Added Index", "Index.1", 1, 1),
    #"Merged Queries" = Table.NestedJoin(#"Added Index1",{"Index"},#"Added Index1",{"Index.1"},"Previous",JoinKind.LeftOuter),
    #"Expanded Previous" = Table.ExpandTableColumn(#"Merged Queries", "Previous", {"Timestamp", "IP"}, {"Previous.Timestamp", "Previous.IP"}),
    #"Sorted Rows" = Table.Sort(#"Expanded Previous",{{"Index", Order.Ascending}}),
    
    // Determine start of series:
    #"Added Custom" = Table.AddColumn(#"Sorted Rows", "Start", each if [Previous.Timestamp] = null then[Timestamp] else if [IP] = [Previous.IP] and [Timestamp] - [Previous.Timestamp] <= #duration(0,0,3,0) then null else [Timestamp]),

    #"**** Merge with itself (next)" = "to get next data on same row as current data",
    #"Merged Queries1" = Table.NestedJoin(#"Added Custom",{"Index.1"},#"Added Custom",{"Index"},"Next",JoinKind.LeftOuter),
    #"Expanded Next" = Table.ExpandTableColumn(#"Merged Queries1", "Next", {"Timestamp", "IP"}, {"Next.Timestamp", "Next.IP"}),
    #"Added Custom1" = Table.AddColumn(#"Expanded Next", "End", each if [Next.Timestamp] = null then [Timestamp] else if [IP] = [Next.IP] and [Next.Timestamp] - [Timestamp] <= #duration(0,0,3,0) then null else [Timestamp]),

    // Fill Start down and End uo:
    #"Filled Down" = Table.FillDown(#"Added Custom1",{"Start"}),
    #"Filled Up" = Table.FillUp(#"Filled Down",{"End"}),

    // Remove columns that are no longer required:
    #"Removed Other Columns" = Table.SelectColumns(#"Filled Up",{"IP", "Start", "End", "OriginalSort"}),

    // Remove duplicates:
    #"Removed Duplicates" = Table.Distinct(#"Removed Other Columns", {"IP", "Start"}),

    #"**** Again: merge with itself (previous)" = "to get previous data on same row as current data",
    #"Added Index3" = Table.AddIndexColumn(#"Removed Duplicates", "Index", 0, 1),
    #"Added Index4" = Table.AddIndexColumn(#"Added Index3", "Index.1", 1, 1),
    #"Merged Queries2" = Table.NestedJoin(#"Added Index4",{"Index"},#"Added Index4",{"Index.1"},"Previous",JoinKind.LeftOuter),
    #"Expanded Previous1" = Table.ExpandTableColumn(#"Merged Queries2", "Previous", {"IP", "End"}, {"Previous.IP", "Previous.End"}),
    #"Sorted Rows2" = Table.Sort(#"Expanded Previous1",{{"Index", Order.Ascending}}),

    // Only keep a row if the previous End is not within the last hour on the same date for the same IP address,
    #"Added Custom2" = Table.AddColumn(#"Sorted Rows2", "Keep", each [Previous.IP] = null or [IP] <> [Previous.IP] or DateTime.Date([Start]) <> DateTime.Date([Previous.End]) or [Start] - [Previous.End] >= #duration(0,1,0,0)),

    #"**** Again: merge with itself (next)" = "to get next data on same row as current data",
    #"Merged Queries3" = Table.NestedJoin(#"Added Custom2",{"Index.1"},#"Added Custom2",{"Index"},"Next",JoinKind.LeftOuter),
    #"Expanded Next1" = Table.ExpandTableColumn(#"Merged Queries3", "Next", {"End", "IP", "Keep"}, {"Next.End", "Next.IP", "Next.Keep"}),
    // Remove records in which both Keep and Next.Keep are false:
    #"Filtered Rows2" = Table.SelectRows(#"Expanded Next1", each [Keep] or [Next.Keep]),

    // Take the End from the next record if the next record won't be kept:
    #"Added Custom3" = Table.AddColumn(#"Filtered Rows2", "NewEnd", each if [Next.Keep] = null then [End] else if not [Next.Keep] then [Next.End] else [End]),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom3", each [Keep]),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"End", "Index", "Index.1", "Previous.IP", "Previous.End", "Keep", "Next.End", "Next.IP", "Next.Keep"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"NewEnd", "End"}}),
    #"Filtered Rows1" = Table.SelectRows(#"Renamed Columns", each [Start] <> [End]),
    #"Sorted Rows3" = Table.Sort(#"Filtered Rows1",{{"OriginalSort", Order.Ascending}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Sorted Rows3",{"OriginalSort"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns1",{{"Start", type datetime}, {"End", type datetime}})
in
    #"Changed Type1"
Specializing in Power Query Formula Language (M)

Thank you!

 

Works flawless, toghether with the Gantt chart I have a nice overniew on when camera's are out of service.

You Rock.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.