cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Frequent Visitor

Grouping rows with values withing certain range of one another

Hello Everybody,

 

I'm doing some data preparation/cleansing and as part of that I need to group rows where time values are within 30 minutes of one another. To help explain further I have added some simplified input data below:

 

Client IDPlanned Start DatePlanned Start TimePlanned Duration
120/07/202009:0030
121/07/202009:0030
122/07/202009:0030
123/07/202009:0030
124/07/202009:0030
125/07/202008:4530
126/07/202009:1530
120/07/202012:0030
121/07/202012:0030
122/07/202012:0030
123/07/202012:0030
124/07/202011:3030
125/07/202012:3030
126/07/202012:3030

 

I can use Table.Group to group all rows which have the same Client ID, Planned Start Time and Planned Duration:

 

 

= Table.Group(#"Changed Type", {"Client ID", "Planned Start Time", "Planned Duration"}, {{"data", each _, type table [Client ID=nullable number, Planned Start Date=nullable date, Planned Start Time=nullable time, Planned Duration=nullable number]}, {"Count", each Table.RowCount(_), Int64.Type}, {"Median Start Time", each List.Median([Planned Start Time]), type nullable time}})

 

Which gives this output:

Client IDPlanned Start TimePlanned DurationdataCountMedian Start Time
109:00:0030table509:00:00
108:45:0030table108:45:00
109:15:0030table109:15:00
112:00:0030table412:00:00
111:30:0030table111:30:00
112:30:0030table212:30:00

 

This is working as expected, but I would like to group rows where the Planned Start Times are within 30 minutes of one another.

This would result in the first three rows being grouped into one: Planned Start Time 09:00, 08:45, 09:15.

The Median Start Time aggregation would then show the median of these times.

 

The last three rows are a bit more complicated since the start times are 12:00, 11:30 and 12:30.

The first two rows (12:00, 11:30) are within 30 minutes of each other, so they could be grouped to form a group of 5 rows.

However, if the 12:00 and 12:30 rows could be grouped that would be even better as it would form a group of 6 rows.

 

I probably need to add a helper column to identify which rows to merge. Any help would be greatly appreciated.

 

Although I've been doing this in power query, it would be fine to do it in DAX if it would be easier.

 

I've uploaded a sample pbix here: https://ufile.io/xyd0n3u9 

 

Many thanks

2 REPLIES 2
Highlighted
Super User III
Super User III

FYI that Edge flagged your download, so I made similar mock data.  Not sure this exactly meets your need but it basically rounds down to the nearest half hour before grouping.  To see how it works, just create a blank query, go to Advanced Editor, and replace the text there with the M code below.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUbK0MjawMjAAsowNlGJ1YIImphiChobYlBoa4RA1QBKNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Client = _t, #"Planned Start Time" = _t, Duration = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Client", Int64.Type}, {"Planned Start Time", type time}, {"Duration", Int64.Type}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Round30Minutes", each #time(Time.Hour([Planned Start Time]), if Time.Minute([Planned Start Time])>30 then 30 else 0, 0)),
    #"Grouped Rows" = Table.Group(#"Added Custom", {"Round30Minutes", "Client", "Duration"}, {{"AllRows", each _, type table [Client=nullable number, Planned Start Time=nullable time, Duration=nullable number, Round30Minutes=time]}})
in
    #"Grouped Rows"

 

If this works for you, please mark it as the solution.  Kudos are appreciated too.  Please let me know if not.

Regards,

Pat





Did I answer your question? Please mark my post as a solution! Kudos are also appreciated!

Proud to be a Super User!




Highlighted

Hi,

 

Thank you for this - I added similar code to my query. It group more values, but it won't group values which cross the half-hour boundaries - for example 11:55 and 12:05 won't be grouped, althought they are only 10 minutes apart. Any suggestion on how I could group these as well?

 

Many thanks

Helpful resources

Announcements
Community Conference

Power Platform Community Conference

Check out the on demand sessions that are available now!

Community Conference

Microsoft Power Platform Communities

Check out the Winners!

secondImage

Create an end-to-end data and analytics solution

Learn how Power BI works with the latest Azure data and analytics innovations at the digital event with Microsoft CEO Satya Nadella.

Top Solution Authors
Top Kudoed Authors