Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hi there.
This is my first post - which I'm writing after struggling for many hours to write an efficient measure 🙃. I am a BI developer for a Warehouse, and my task is to calculate the number of seconds that elapse since a previous pallet was loaded to a truck. So how many seconds passed since the last time a user performed a loading transaction.
I was able to accomplish this in Excel by adding calculated columns in excel, but I want to avoid calculated columns for this if I can because my data has many other function types other than Loading, millions of rows, plus I would need to have a consultant add them to the model sadly. That calculated column looked like this:
=CALCULATE(MAX(Table[date_time]), FILTER(Table, EARLIER(Table[pallet_id]) = Table[pallet_id] && EARLIER(Table[date_time]) > Table[date_time]), Table[function] = "Load")
Below is a few minutes of transaction data to show what I'm lookin at. One complication is as you can see, when a pallet is scanned, every sku on the same pallet gets it's own row, with mostly the same date_time stamp.
My goal is to be able to have a table of pallet_id's, and the number of seconds that passed. Does anyone have any suggestions how I could calculate this measure in an efficent manner?
date_time function user truck# pallet_id sku
5:58:26 AM | Load | 1092 | 10 | 101 | a |
5:58:26 AM | Load | 1092 | 10 | 101 | b |
5:58:26 AM | Load | 1092 | 10 | 101 | c |
5:58:26 AM | Load | 1092 | 10 | 101 | d |
5:58:26 AM | Load | 1092 | 10 | 101 | e |
5:58:26 AM | Load | 1092 | 10 | 101 | f |
6:01:55 AM | Load | 1092 | 10 | 102 | a |
6:01:55 AM | Load | 1092 | 10 | 102 | b |
6:01:55 AM | Load | 1092 | 10 | 102 | c |
6:01:55 AM | Load | 1092 | 10 | 102 | d |
6:02:01 AM | Load | 1092 | 9 | 103 | a |
6:02:01 AM | Load | 1092 | 9 | 103 | b |
6:02:01 AM | Load | 1092 | 9 | 103 | c |
6:02:01 AM | Load | 1092 | 9 | 103 | d |
6:02:01 AM | Load | 1092 | 9 | 103 | e |
6:02:16 AM | Load | 1092 | 9 | 104 | a |
6:02:16 AM | Load | 1092 | 9 | 104 | b |
6:02:16 AM | Load | 1092 | 9 | 104 | c |
6:02:16 AM | Load | 1092 | 9 | 104 | d |
6:02:16 AM | Load | 1092 | 9 | 104 | e |
6:02:16 AM | Load | 1092 | 9 | 104 | f |
Solved! Go to Solution.
This calculation will be done for each user separately?
Thanks for your response @v-rongtiep-msft . This isn't what I'm looking for.
See the images of my desired result + calculated Column. I want to get to this without any calculated columns, just a measure:
The measure I used is simply
Hi @NonprofitWizard
Somehow I missed to answer this query. Aplogies for that.
Here is a sample file with the proposed solution https://we.tl/t-Rc90TG1vUz
Measure Tamer =
SUMX (
SUMMARIZE ( 'Table', 'Table'[user], 'Table'[pallet_id] ),
CALCULATE (
VAR CurrentPallet =
SELECTEDVALUE ( 'Table'[pallet_id] )
VAR CurrentUserTable =
CALCULATETABLE ( 'Table', ALLEXCEPT ( 'Table', 'Table'[user] ) )
VAR PreviousPalletsTable =
FILTER ( CurrentUserTable, 'Table'[pallet_id] < CurrentPallet )
VAR PreviousPallet = MAXX ( PreviousPalletsTable, 'Table'[pallet_id] )
VAR PreviousPalletRecord =
FILTER ( PreviousPalletsTable, 'Table'[pallet_id] = PreviousPallet )
VAR CurrentPalletStart =
MIN ( 'Table'[date_time] )
VAR PreviousPalletEnd =
MAXX ( PreviousPalletRecord, 'Table'[date_time] )
RETURN
DATEDIFF ( PreviousPalletEnd, CurrentPalletStart, SECOND )
)
)
Hi @NonprofitWizard ,
Please refer to my pbix file to see if it helps you.
Create a measure.
Measure =
VAR _min =
CALCULATE (
MIN ( 'Table'[date_time] ),
FILTER (
ALL ( 'Table' ),
'Table'[user] = SELECTEDVALUE ( 'Table'[user] )
&& 'Table'[sku] = SELECTEDVALUE ( 'Table'[sku] )
)
)
VAR _next =
CALCULATE (
MAX ( 'Table'[date_time] ),
FILTER (
ALL ( 'Table' ),
'Table'[rankx]
= SELECTEDVALUE ( 'Table'[rankx] ) - 1
&& 'Table'[sku] = SELECTEDVALUE ( 'Table'[sku] )
)
)
VAR _now =
MAX ( 'Table'[date_time] )
RETURN
IF ( MAX ( 'Table'[date_time] ) = _min, 0, DATEDIFF ( _next, _now, SECOND ) )
If I have misunderstood your meaning, please provide y more details with your desired output.
Best Regards
Community Support Team _ Polly
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Thanks for your response @v-rongtiep-msft . This isn't what I'm looking for.
See the images of my desired result + calculated Column. I want to get to this without any calculated columns, just a measure:
The measure I used is simply
This wasn't the solution, not sure why it's labeled that way
Hi @NonprofitWizard
Somehow I missed to answer this query. Aplogies for that.
Here is a sample file with the proposed solution https://we.tl/t-Rc90TG1vUz
Measure Tamer =
SUMX (
SUMMARIZE ( 'Table', 'Table'[user], 'Table'[pallet_id] ),
CALCULATE (
VAR CurrentPallet =
SELECTEDVALUE ( 'Table'[pallet_id] )
VAR CurrentUserTable =
CALCULATETABLE ( 'Table', ALLEXCEPT ( 'Table', 'Table'[user] ) )
VAR PreviousPalletsTable =
FILTER ( CurrentUserTable, 'Table'[pallet_id] < CurrentPallet )
VAR PreviousPallet = MAXX ( PreviousPalletsTable, 'Table'[pallet_id] )
VAR PreviousPalletRecord =
FILTER ( PreviousPalletsTable, 'Table'[pallet_id] = PreviousPallet )
VAR CurrentPalletStart =
MIN ( 'Table'[date_time] )
VAR PreviousPalletEnd =
MAXX ( PreviousPalletRecord, 'Table'[date_time] )
RETURN
DATEDIFF ( PreviousPalletEnd, CurrentPalletStart, SECOND )
)
)
Okay this works in my simulation - but why does my actual model run out of resources after 5 minutes of calculating😑. The entire facts table only has 4.2 million records, about 10% of which are Load records.
Maybe I need to message the consultant and figure out what's up.
I'm going to mark this as a solution for others even though it doesn't work for me. Thank you so much...
would you please present the expected result let's say for pallet "a"?
Apoligies, I'm not sure the best way to format the tables... pallet_id's are actually labeled 101 - 104 in my examples, a-f are for the sku.
Like this:
Pallet_ID Seconds Since Last Load
102 | 209 |
103 | 6 |
104 | 15 |
Note how in the sample data, pallet_id 104 was loaded at 6:02:16 AM, while pallet_id 103 was the most recent pallet loaded 6:02:01 AM. These seconds that elapsed between the pallets was 15.
For pallet 101 there is no value because it was the first pallet loaded.
That's correct, time between each users transactions is probably the most important data I want to see. Then I'd eventually want to also display time between each consectuve pallet by truck#, etc.
This is what I have in a pivot table using the calculated columns:
Date | user_id | truck_# | Pallet Tag | Load Duration (Seconds) |
####### | 1092 | 10 | 111 | |
112 | 223 | |||
113 | 6 | |||
114 | 209 | |||
5028 | 9 | 201 | ||
202 | 95 | |||
203 | 27 |
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
54 | |
27 | |
23 | |
14 | |
11 |
User | Count |
---|---|
77 | |
63 | |
44 | |
17 | |
12 |