Hi there.
Below is my table in Power Query
How can I group by Email and sum the ActualtimeSpent (minutes) column without using Group By?
I tried Group By, but it takes far too long: over 5 minutes.
Is there an alternative?
Thanks a bunch!
Hi Keith,
This may be a flaw in the code that I provided (hard to say without seeing the entire code in your query). Sometimes the code structure causes PQ to run nested queries several times. The last one I provided should be more efficient from this perspective.
However, it may be easier to add the cumulative column as we go, together with the ActualtimeSpent calcs:
let
Tasks = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("jVnLchQ5EPyVjT5z6JZaTYRv+wQWDJ714sUeDMz6gfEw+PH/hyVaWpRZWR2jCF9akVNVqspKleT1uvu5e9INB33//e+no8P5I9SP0HfnT/4HDfP6/S5/pB8faQkTKmaooHAwVgfxoE/VWwXReizecnw9gAa0FN24R7SUcKcpu/tFMwCWYrb0q4IwpgSgYcFSiek3tYQBzpjfGzB/NGMSZJiKW0Gh1KqAoqT7mXoDsyWTzyx9aP9gKaAl9p0AFIgn9WNESwF5AhkgUCKeAKi6o/VISauWRtxQwo+JQMkHQZ6YiRhg6gE0VksTma0gWn+KtCwpeG5rR+mfEBQWChwTgCJSZcldQnea8ec5DLIEKQAMicUocb9QYo7i7QXIkWagbO7PFksvW/rgVTsoLGRgRBC2I8ZUOuqVNjDkqWjTYYNcHNqQmAPZ22tHCaXJX2uTR2HTGyfh0uRHjjqLpSN1p6U7sjkmEFmK/UIuc0wrjUnPzJWTTNHeld01ZXxAUFroOnBH63RoThVkDk1Nwcr+mMRwrJZGan7aKoL2KOZKRJLyVEETBj5RTDnwv1ra7rildscqhpCngJb2NPBxZw9drDa4YzKSMlbQ3oHn74YuzxiiuFblrTXknRlvW9ruxKmKHHYnoiNOfU+0gcHS2ANoz3l/YnMcqY7Z0j8t58G7lt290zwp505b3J3a2g2OPJ2K2KCl0gdnLbp6ZtcHZy46U3nC0WbGrFu8rVs2t+4aptW13XQgWa0gJgeVKAEIjyVSxgqia8ZIpesBFBdAA4KQQfgxVRCv06ScQe/fOylfQoWFwpQDqKDGhRIjypbGEmEGRZ7sVVoLKmLag7RVQdGAmIQwM2pEVCIVZRQN7jJuF9Se+swoM7oPwqwZNWG6JofJ5w2Cfm7LRi0xVUNm4MbEVUtGF7FNs6UPLUfDB2UMVjO7+9hi6aMVguA8GXzSNOn1/JNawhIjiNQMy11B5no6CD03LbvLIJoSpWc2Wt8gudx0Mv/1wt9NS303timpq7K3fzXfeuxdtCTgQorlTJsX9sdhwZ1l+IKlPfP9haoPTnYZdNnQmZe2oMGRsUubPu++fGlD5TN9xlxpRPqycuUQTlTnyoYanFMhg/CCSANcBUW+J+i15EpSTAFWS6TciQQsW7pu0YHrloRf2/XoPHVca1X0APqsMelB/FnbABseQSQEvbTvjeNOJpYbYYfzdHajLNAnoRvLxeicm19aYsogOjJQYRFE8hxdEN+RnWTeakz6+nBr+3pwBPNWY5IBeNvSmxkUFuSiNMu202cModO2M0+m0RHMrcpc71oyF/0oXZ5BYaGBEUP3fNXCDMJbO8lcBe0dr7bzek8g0bmvWhV9MfjqyIUQZdciPDuHTRL4zpLAm8h3DsOlKrvOPokTLxFEF5wgu/vm7E507pusY4DZ25015I0EdzYz/OiJlkh69WJ2pwqml437Fn2+1x7Xw+6+kwuQPvk9OEogjwEPyrlBAn9QEmBHzZjHFoY/CsvouvQDY0de15B5fNGXh0dLS3pfLD2eQXRbDgsg4q7+g+DRrsv17/w/", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Email = _t, startedAt = _t, finishedAt = _t, #"timeSpent (minutes)" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Table.Combine({Tasks, Tasks, Tasks, Tasks, Tasks, Tasks, Tasks, Tasks, Tasks}),{{"Email", type text}, {"startedAt", type time}, {"finishedAt", type time}, {"timeSpent (minutes)", Int64.Type}}),
Schedule = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMrIyMACiglylWJ1oJWMrYxgvFgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Task Duration" = _t]),
ChangeType = Table.TransformColumnTypes(Schedule,{{"Task Duration", type time}}),
processStarts = Time.From(ChangeType{0}[Task Duration]),
processEnds = ChangeType{1}[Task Duration],
#"Added latestStartedAt" = Table.AddColumn(#"Changed Type", "latestStartedAt", each List.Max({[startedAt],processStarts}), type time),
#"Added earliestFinishedAt" = Table.AddColumn(#"Added latestStartedAt", "earliestFinishedAt", each List.Min({[finishedAt], processEnds}),type time),
#"Filtered Rows" = Table.SelectRows(#"Added earliestFinishedAt", each [latestStartedAt] < [earliestFinishedAt]),
fCalculate = (t as table) =>
    let
        // Buffer the group's rows so they are not re-evaluated on every access
        m = Table.Buffer(t),
        // a = list of records accumulated so far, n = current row
        fProcess = (a, n) =>
            let
                previousCumul = List.Last(a)[cumul],
                currentStartedAt = n[latestStartedAt],
                currentFinishedAt = n[earliestFinishedAt],
                actualFinish = List.Last(a)[actualFinish],
                // Start counting only from where the earlier tasks have finished
                actualStart = List.Max({actualFinish, currentStartedAt}),
                outputRecord = [
                    #"ActualtimeSpent (minutes)" = List.Max({0, Duration.TotalMinutes(currentFinishedAt - actualStart)}),
                    actualFinish = List.Max({actualFinish, currentFinishedAt}),
                    cumul = previousCumul + #"ActualtimeSpent (minutes)"
                ]
            in
                outputRecord,
        // Seed the accumulator with a starting record, then drop it with List.Skip
        process = List.Skip(List.Accumulate(Table.ToRecords(m), {[actualFinish = m{0}[startedAt], #"finishedAt" = m{0}[startedAt], cumul = 0]}, (a, n) => a & {n & fProcess(a, n)}))
    in
        process,
Group = Table.Group(#"Filtered Rows", "Email", {{"Data", fCalculate}}),
Expand = Table.FromRecords(List.Combine(Group[Data]), Value.Type(Table.AddColumn(Table.AddColumn(#"Changed Type", "ActualtimeSpent (minutes)", each null,type number), "cumul", each null, type number))),
Output = Table.Combine({Expand, Table.RemoveColumns(Table.SelectRows(#"Added earliestFinishedAt", each not ([latestStartedAt] < [earliestFinishedAt])), {"latestStartedAt", "earliestFinishedAt"})})
in
Output
Kind regards,
John
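For anyone following along, the seed-accumulate-skip pattern that fCalculate relies on can be looked at in isolation. The query below is only a minimal sketch: the sample values and the field names value and cumul are made up for illustration and are not taken from the data in this thread.

```powerquery
let
    // Hypothetical sample minutes, not from the thread's data
    Minutes = {30, 45, 15},
    // Start from a seed record, append one record per value, then drop the seed
    Running = List.Skip(
        List.Accumulate(
            Minutes,
            {[value = 0, cumul = 0]},
            (a, n) => a & {[value = n, cumul = List.Last(a)[cumul] + n]}
        )
    ),
    // Turn the list of records into a table with columns value and cumul
    Result = Table.FromRecords(Running)
in
    Result
```

Each step reads the running total from the last record in the accumulator, which is how the cumul column is built up row by row inside each Email group.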
Hi John @jbwtp
Thanks for your solution in the previous post. The performance "slowdown" was down to a silly mistake on my side. I'm closing this post as I've managed to fix that mistake.
Thanks a bunch!
Hi @Keith011 ,
The most obvious solution would be to just send the table to the data model as-is, create a measure that is SUM(yourTable[actualTimeSpent (minutes)]), then chuck this in a visual with your [email] field.
Power BI will do the group/aggregation for you.
Pete
Proud to be a Datanaut!