Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi All,
I am facing another problem with my data set. I am attaching an image for a sample of my dataset.
The data set contains the time spent by children in a session on different resources. I need to find the time difference between the end time of a resource id and the start time of the next resource for the same session.
I just need to find out how much time do students spend while moving from one resource to another.
Its urgent so please help me as soon as you can.
Hi,
Is your problem solved? What can I do for you?
Best Regards!
Dale
Hi Kumar, no hurry. Please take your time. Hope you will get well soon.
Best Regards!
Dale
Hi @v-jiascu-msft,
I tried doing it on my data set. My data set is slightly big. There are about 2 mil rows. When I did it on my data set, it was taking too much time. I entered the formula and for two hours it just said "Working On It", but it never did :-/. Is it because the data set is big or some other reason. My desktop has more than decent hardware ( 8 Gb ram, octa-core processor). But I did on a smaller data set it happened very quickly.
Please help me. what might be the reason. Is there any other way can do the same and it happens very quickly.
Hi @Kumar11109,
I had a test about 500k. It was very slow. I think the main cause is that we put a complicated formula on a large dataset. Let's split the formula into two parts as calculated columns.
NextStartTime = CALCULATE ( MIN ( Table1[Start Time] ), FILTER ( Table1, Table1[Session ID] = EARLIER ( Table1[Session ID] ) && Table1[Index] = EARLIER ( Table1[Index] ) + 1 ) )
TimeSpent = IF ( ISBLANK ( [NextStartTime] ), BLANK (), [NextStartTime] - Table1[End Time] )
Best Regards!
Dale
HI @v-jiascu-msft Dale,
I tried using your formula for that you provided me, but this time what happened was that it only calculated the next time for only 1 resource of a session and because of this it only calculated the idle time between for only those resources. I am attaching an image for the same. Also, the formula that you had shared, it was slightly different from the one you had shared earlier.
NextStartTime = CALCULATE ( MIN ( Table1[Start Time] ), FILTER ( Table1, Table1[Session ID] = EARLIER ( Table1[Session ID] ) && Table1[Index] = EARLIER ( Table1[Index] ) + 1 ) )
TimeSpent = IF ( ISBLANK ( [NextStartTime] ), BLANK (), [NextStartTime] - Table1[End Time] )
I tried doing it with the old formula, only separating the time spent and next time and didn't work again.
NextStartTime = CALCULATE ( MIN ( Table1[Start Time] ), FILTER ( Table1, Table1[Index] >= EARLIER ( Table1[Index] ) + 1 && Table1[Session ID] = EARLIER ( Table1[Session ID] ) ))
TimeSpent = IF ( ISBLANK ( NextStartTime ), BLANK (), NextStartTime - Table1[End Time] )
Can we further break down formula? Also, can you explain me the formula?
Thank You,
Kumar Ashwarya
Hi Kumar,
There is two preconditions that we evalute the time spent in one session and the data has orders. Have look at the picture. Let's take session 1 and 2 as example. Time Spent = [the next start time] - [the end time]. The "Time Spent" from session 1 to 2 will be empty. (row 5)
In your latest post, you order the table by "resourceid", so it looks strange. Please reorder it as the orginal one (like below). It's clear there is three moves in session 1 and there will be three "time spent".
I split one formula into two to get high performance.
Best Regards!
Dale
I did not sort it this time, but its still working the same way.
Look at the formula that I have applied, I hope it's not incorrect.
Hi Kumar,
You can see the index is disordered. That's why the result is wrong. If we don't use an index, the performance will be terrible due to we have to filter and compare. Let's add a useful index.
1. Open "Query Editor", select "Session ID", sort a->z;
2. Click to open "Advanced Editor", add ", {"Start Time", Order.Ascending}" as the picture showed;
3. Click "Done", we will see an arrow in "session ID" and an arrow in "Start Time";
4. Add index now.
Why we do this? There is an order along time when people move from one resource to another in one session. If this is true, it would work. We can use (index + 1) to get the next "Start Time" without comparing many "Start Time" to get the most recent "Start Time", which is consuming performance.
Best Regards!
Dale
One thing, my time format also contains the date. I hope this won't cause a problem???
Hi @Kumar11109
I entered the following data to create your table
Session ID Resource ID Start Time End Time 1 11 18:15:30 18:17:35 1 12 18:19:17 18:21:21 1 13 18:23:04 18:25:07 1 12 18:26:51 18:28:53 2 11 18:30:38 18:32:39 2 13 18:34:25 18:36:25 2 12 18:38:12 18:40:11 3 11 18:41:59 18:43:57 3 12 18:45:46 18:47:43 3 13 18:49:33 18:51:29 4 15 18:53:20 18:55:15 4 11 18:57:05 18:59:01 4 15 19:00:54 19:02:47 4 12 19:04:41 19:06:33
and then applied these steps in the Query Editor
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ZZBbDoQgDEX3wrcfpQ+QuxXj/rcxFa+DmUmIOamH9tLjKLVspc7PjhowIXZYlHOjoKyO/HGj1jxLMFYN4sSA9L8O2hCcpjvCpqCvDCawnaiwsQSOMM/WxHbhV+AIy5xEF9Q7pL1GeEUMoiH6Ep5rAW/Ens4SmMEHjBi5hzvkfDeThUG5yYjc6xKYITrkcQek/nTIkiCcqBljCcqq50OI7Ypznh8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"Session ID" = _t, #"Resource ID" = _t, #"Start Time" = _t, #"End Time" = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Session ID", Int64.Type}, {"Resource ID", Int64.Type}, {"Start Time", type time}, {"End Time", type time}}), #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Session ID", Order.Ascending}, {"Resource ID", Order.Ascending}, {"End Time", Order.Ascending}}), #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1), #"Added Index1" = Table.AddIndexColumn(#"Added Index", "Index.1", 1, 1), #"Merged Queries" = Table.NestedJoin(#"Added Index1",{"Index.1"},#"Added Index1",{"Index"},"NewColumn",JoinKind.LeftOuter), #"Expanded NewColumn1" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Session ID", "Resource ID", "Start Time"}, {"NewColumn.Session ID", "NewColumn.Resource ID", "NewColumn.Start Time"}), #"Added Custom" = Table.AddColumn(#"Expanded NewColumn1", "Custom", each if ([Session ID] = [NewColumn.Session ID] and [Resource ID] = [NewColumn.Resource ID]) then [NewColumn.Start Time] - [End Time] else 0), #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index", "Index.1", "NewColumn.Session ID", "NewColumn.Resource ID", "NewColumn.Start Time"}), #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Custom", "Time Spent"}}) in #"Renamed Columns"
Which resulted in the following table
@Phil_SeamarkI don't think so this is what I want. If you look at the column "Time Spent" and the difference between end time for resource 11 and start time for resource 12 is 00:1:53 seconds and not 5 mins and 30 seconds.
This is the problem, I hope I am a bit more clear now.
Thank You for your quick reply BTW.
And it would be really appreciated if we can do this without going into the query, if possible and by just adding a column.
Thank You,
Kumar Ashwarya
Hi Kumar Ashwarya,
According to your conditions, the records in this table should have an order. So firstly, we add an index column. Then we can use this formula in calculated column to get the time spent.
TimeSpent = VAR NextStartTime = CALCULATE ( MIN ( Table1[Start Time] ), FILTER ( Table1, Table1[Index] >= EARLIER ( Table1[Index] ) + 1 && Table1[Session ID] = EARLIER ( Table1[Session ID] ) ) ) RETURN IF ( ISBLANK ( NextStartTime ), BLANK (), NextStartTime - Table1[End Time] )
Ps: 1. Column NextStartTime is just to show the process, you don’t need to add it.
2. Pay attention to the data type.
I tried using your method, but it is showing errors.
The error is "Token Eof expected".
Do I need to create this column out of the query or in the query editor only?
Hi,
We add index in the Query Editor. And then add a calculated column in the table with the formula. These two steps are performed in different place. The error showed whether you were in the wrong places or you missed some parenthesis (usually right parenthesis). Please check it out.
Best Regards!
Dale
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
109 | |
99 | |
77 | |
66 | |
54 |
User | Count |
---|---|
144 | |
104 | |
102 | |
87 | |
64 |