Solved: Re: DAX/Group(Bins) for running buckets

vissvess · ‎08-02-2019

Hi,

My data set has a column named "Time" with time stamp values.

When I right click the field "Time" & create new group, it provides me an way to create groups/bins/buckets of equal specifed durations. Apparently, it creates a column in the table with corresponding bucket value matching "Time" column. When I create a continuous bar chart/histogram, the bucktets falls in X axis and count / total for some other fields falls in Y axis. So far everyone aware.

My requirement is that I need the X axis to be a running bucket.

Instead of X axis value as follows,

00:00:00 AM, 3:00:00 AM, 6:00:00 AM, 9:00:00 AM, 12:00:00 PM, 3:00:00 PM, 6:00:00 PM, 9:00:00 PM & 12:00:00 AM.

I need the X axis value to be as follows.

00:00:00 AM - 03:00:00 AM, 01:00:00 AM - 04:00:00 AM, 02:00:00 AM - 05:00:00 AM, and so on till 09:00:00 PM - 12:00:00 AM.

I aware a histogram cannot be created with such X axis. But I can create a clustered/stack column chart with the values as X axis.

Any workaround would help me a lot.

Thanks

Vishvesh R

Anonymous · ‎08-04-2019

This is a sample of the table again (simplified):

Time	Time Bucket	BucketSortOrder
01:10:00	23-2	23
01:10:00	0-3	0
01:10:00	1-4	1
06:20:22	4-7	4
06:20:22	5-8	5
06:20:22	6-9	6

What you have to do is this:

1. Create a table with all your buckets in it. It has to look like this:

Time Bucket	StartInclusive	EndExclusive	BucketSortOrder
00:00:00-03:00:00	00:00:00	03:00:00	0
01:00:00-04:00:00	01:00:00	04:00:00	1
02:00:00-05:00:00	02:00:00	05:00:00	2
03:00:00-06:00:00	03:00:00	06:00:00	3
04:00:00-07:00:00	04:00:00	07:00:00	4
and so on… up to
23:00:00-02:00:00	23:00:00	02:00:00	23

2. Once you have it, you cross-join it with the table that stores your unique times. Cross-joining in PQ (M) is easy: just create a column and type "=Your_Table_Name".

3. After that, you expand the table, so that the expanded columns appear.

4. Then you add another column that will now return TRUE if

[Time] >= [StartInclusive] and [Time] < [EndExclusive].

5. Then you filter the table leaving only the rows where the above is TRUE.

6. Then you remove all the columns apart from [Time], [Time Bucket] and [BucketSortOrder].

7. Now join the [Time] to the original column with the times and do all the things I've explained before...

Best

Darek

View solution in original post

Anonymous · ‎08-02-2019

Mate, you have to create a table (in Power Query) of this structure:

Time Buckets (table)

Time|Time Bucket|BucketSortOrder

01:10:00|00:00:00 AM - 03:00:00 AM|1

01:10:00|01:00:00 AM - 04:00:00 AM|2

...

where

[Time] is a (hidden) column with all the times in your table,

[Time Bucket] is the bucket into which the Time falls (one time will fall into many buckets),

[BucketSortOrder] is a (hidden) column that you'll be sorting the buckets by.

Remember that if a bucket A appears in the table many times, it has to have the same BucketSortOrder.

Once this table's been created, you have to join it to your table on the [Time] column and the relationship will be M2M. Filtering will be one-way only, from [Time Buckets] to your table.

Then you can take the [Time Bucket] column from the Time Buckets table, drop it onto your visual and add your measures. They should aggregate correctly for the time periods/categories.

Best

Darek

vissvess · ‎08-03-2019

@Anonymous ,

Thanks for the inputs.

As per your advise, I need to create a table as shown by you.

For column 1 [Time], from my table I'll reference my existing query, remove other columns & remove duplicate in the specified column.

For column 2 & 3, I was working out the code.

As I have 3 hours bucket, for every timestamp, I'll have three buckets, if I am not wrong.

So If I create 3 columns with column name as sort order (1,23) so that I'll unpivot to create it as specified.

But If I use conditional column, I was not able to achieve the result.

@Zubair_Muhammad & @Anonymous ,

Need your help for the power query stuff.

Thanks

Anonymous · ‎08-04-2019

This is a sample of the table again (simplified):

Time	Time Bucket	BucketSortOrder
01:10:00	23-2	23
01:10:00	0-3	0
01:10:00	1-4	1
06:20:22	4-7	4
06:20:22	5-8	5
06:20:22	6-9	6

What you have to do is this:

1. Create a table with all your buckets in it. It has to look like this:

Time Bucket	StartInclusive	EndExclusive	BucketSortOrder
00:00:00-03:00:00	00:00:00	03:00:00	0
01:00:00-04:00:00	01:00:00	04:00:00	1
02:00:00-05:00:00	02:00:00	05:00:00	2
03:00:00-06:00:00	03:00:00	06:00:00	3
04:00:00-07:00:00	04:00:00	07:00:00	4
and so on… up to
23:00:00-02:00:00	23:00:00	02:00:00	23

2. Once you have it, you cross-join it with the table that stores your unique times. Cross-joining in PQ (M) is easy: just create a column and type "=Your_Table_Name".

3. After that, you expand the table, so that the expanded columns appear.

4. Then you add another column that will now return TRUE if

[Time] >= [StartInclusive] and [Time] < [EndExclusive].

5. Then you filter the table leaving only the rows where the above is TRUE.

6. Then you remove all the columns apart from [Time], [Time Bucket] and [BucketSortOrder].

7. Now join the [Time] to the original column with the times and do all the things I've explained before...

Best

Darek

vissvess · ‎08-05-2019

Hi, @Anonymous ,

The solution is awesome. But, since my table which contains the timestamp column is somewhat huge and for the 12 row new time bucket table merging, the preview generation and applying changes takes nearly 2 hours.

I understand that the reason behind would be inserting and expanding the entire table to each bucket row in new table.

Is there any other workaround for checking entry by entry to reduce the computation?.

Apart from above, Could you please clarify the need for "Bucket sort order" column.

An additional requirement if possible, does it possible to generate the time bucket table automatically based on dynamic bin size and start time. Say, I need the bucket size to be 1.5 hours and start time of buckets is 8:30:00.

If possible, can I provide an way for the user to change the above parameters.

Thanks

Anonymous · ‎08-09-2019

When you say "somewhat huge", how many rows are you talking about?

Best
Darek

Anonymous · ‎08-09-2019

First of all, any kind of bucketing MUST be pre-calculated in advance for this to work the way you want. If you have 10 types of bucketing, you'll need 10 different columns in your table. Easy as that, no workaround is possible because you have to have an x-axis with the buckets and you can't put a measure on the x-axis, only a dimension's values.

Secondly, BucketSortOrder is in there to ensure that your buckets, which are text labels, appear correctly ordered (by time, not alphabetically).

Thirdly, of course the M code required to do the above CAN BE OPTIMIZED. When I get a chance, I'll try to manually craft the script instead of relying on the UI. One optimization I can see immediately is that you could use DISTINCT stamp times only instead the whole of your table. That could reduce the cross join. But even a cross join is not necessary to get what you want but that requires manual intervention. If you know M, then you could do it yourself.

Best
Darek

DAX/Group(Bins) for running buckets

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024