Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello Everybody,
I'm doing some data preparation/cleansing and as part of that I need to group rows where time values are within 30 minutes of one another. To help explain further I have added some simplified input data below:
Client ID | Planned Start Date | Planned Start Time | Planned Duration |
1 | 20/07/2020 | 09:00 | 30 |
1 | 21/07/2020 | 09:00 | 30 |
1 | 22/07/2020 | 09:00 | 30 |
1 | 23/07/2020 | 09:00 | 30 |
1 | 24/07/2020 | 09:00 | 30 |
1 | 25/07/2020 | 08:45 | 30 |
1 | 26/07/2020 | 09:15 | 30 |
1 | 20/07/2020 | 12:00 | 30 |
1 | 21/07/2020 | 12:00 | 30 |
1 | 22/07/2020 | 12:00 | 30 |
1 | 23/07/2020 | 12:00 | 30 |
1 | 24/07/2020 | 11:30 | 30 |
1 | 25/07/2020 | 12:30 | 30 |
1 | 26/07/2020 | 12:30 | 30 |
I can use Table.Group to group all rows which have the same Client ID, Planned Start Time and Planned Duration:
= Table.Group(#"Changed Type", {"Client ID", "Planned Start Time", "Planned Duration"}, {{"data", each _, type table [Client ID=nullable number, Planned Start Date=nullable date, Planned Start Time=nullable time, Planned Duration=nullable number]}, {"Count", each Table.RowCount(_), Int64.Type}, {"Median Start Time", each List.Median([Planned Start Time]), type nullable time}})
Which gives this output:
Client ID | Planned Start Time | Planned Duration | data | Count | Median Start Time |
1 | 09:00:00 | 30 | table | 5 | 09:00:00 |
1 | 08:45:00 | 30 | table | 1 | 08:45:00 |
1 | 09:15:00 | 30 | table | 1 | 09:15:00 |
1 | 12:00:00 | 30 | table | 4 | 12:00:00 |
1 | 11:30:00 | 30 | table | 1 | 11:30:00 |
1 | 12:30:00 | 30 | table | 2 | 12:30:00 |
This is working as expected, but I would like to group rows where the Planned Start Times are within 30 minutes of one another.
This would result in the first three rows being grouped into one: Planned Start Time 09:00, 08:45, 09:15.
The Median Start Time aggregation would then show the median of these times.
The last three rows are a bit more complicated since the start times are 12:00, 11:30 and 12:30.
The first two rows (12:00, 11:30) are within 30 minutes of each other, so they could be grouped to form a group of 5 rows.
However, if the 12:00 and 12:30 rows could be grouped that would be even better as it would form a group of 6 rows.
I probably need to add a helper column to identify which rows to merge. Any help would be greatly appreciated.
Although I've been doing this in power query, it would be fine to do it in DAX if it would be easier.
I've uploaded a sample pbix here: https://ufile.io/xyd0n3u9
Many thanks
FYI that Edge flagged your download, so I made similar mock data. Not sure this exactly meets your need but it basically rounds down to the nearest half hour before grouping. To see how it works, just create a blank query, go to Advanced Editor, and replace the text there with the M code below.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUbK0MjawMjAAsowNlGJ1YIImphiChobYlBoa4RA1QBKNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Client = _t, #"Planned Start Time" = _t, Duration = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Client", Int64.Type}, {"Planned Start Time", type time}, {"Duration", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Round30Minutes", each #time(Time.Hour([Planned Start Time]), if Time.Minute([Planned Start Time])>30 then 30 else 0, 0)),
#"Grouped Rows" = Table.Group(#"Added Custom", {"Round30Minutes", "Client", "Duration"}, {{"AllRows", each _, type table [Client=nullable number, Planned Start Time=nullable time, Duration=nullable number, Round30Minutes=time]}})
in
#"Grouped Rows"
If this works for you, please mark it as the solution. Kudos are appreciated too. Please let me know if not.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Hi,
Thank you for this - I added similar code to my query. It group more values, but it won't group values which cross the half-hour boundaries - for example 11:55 and 12:05 won't be grouped, althought they are only 10 minutes apart. Any suggestion on how I could group these as well?
Many thanks
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
113 | |
97 | |
84 | |
67 | |
60 |
User | Count |
---|---|
150 | |
120 | |
99 | |
87 | |
68 |