Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi
i have an issue with something i can do easily in excel but not in Pbi
Of course i wanna do it in Pbi because the real input fil is like 10 millions row and 30 columns of informations
I have search in the forum but could not find a case like mine
basically, the input is a list of event with dates and equipment name
1 | B | C | D | E |
2 | EQT NB | date | Event Text | Event ID |
3 | EQT1 | 01/02/2019 12:43 | Event Text 01 | ? |
4 | EQT2 | 01/02/2019 12:44 | Event Text 02 | ? |
5 | EQT1 | 01/02/2019 12:45 | Event Text 03 | ? |
6 | EQT1 | 02/02/2019 12:46 | Event Text 04 | ? |
7 | EQT2 | 02/02/2019 12:47 | Event Text 05 | ? |
8 | EQT2 | 02/02/2019 13:08 | Event Text 06 | ? |
9 | EQT1 | 03/02/2019 17:08 | Event Text 07 | ? |
10 | EQT1 | 03/02/2019 17:48 | Event Text 08 | ? |
11 | EQT2 | 03/02/2019 18:02 | Event Text 09 | ? |
12 | EQT2 | 03/02/2019 19:00 | Event Text 10 | ? |
13 | EQT2 | 03/02/2019 19:00 | Event Text 11 | ? |
I need to associate an event ID based on the following rules :
for each different equipment, if an event occurs, all events in the next 60 minutes on the same equipment need to be attached to the first event ID
After 61 minutes, a new Event ID can be generated
here is how i do it in excel
1- I reordered the files based on the equipment NB ==> can be done easily on Pbi
2- I transform the date in a count of minutes ==> can be done easily on Pbi
3- I calculate a reference which is incremented only if the "minute count" is more than "previous reference + 60" ... if not i just recopy the previous reference ==> that is what i can not do in Pbi
4- i rename the event ID as "EQT NB" _ "Date" based on a comparison of the reference ==> I didn't do it in Pbi but a guess a simple merge should work...
(i did recopy the formula I used in the last row to help you understand...)
| H | I | J | K | L | M | |
2 | EQT NB | date | Event Text | date in minutes ID | reference | Event ID | |
3 | EQT1 | 01/02/2019 12:43 | Event Text 01 | 10603483 | 10603483 | EQT1_01/02/2019 12:43 | |
4 | EQT1 | 01/02/2019 12:45 | Event Text 03 | 10603485 | 10603483 | EQT1_01/02/2019 12:43 | |
5 | EQT1 | 02/02/2019 12:46 | Event Text 04 | 10604926 | 10604926 | EQT1_02/02/2019 12:46 | |
6 | EQT1 | 03/02/2019 17:08 | Event Text 07 | 10606628 | 10606628 | EQT1_03/02/2019 17:08 | |
7 | EQT1 | 03/02/2019 17:48 | Event Text 08 | 10606668 | 10606628 | EQT1_03/02/2019 17:08 | |
8 | EQT2 | 01/02/2019 12:44 | Event Text 02 | 10603484 | 10603484 | EQT2_01/02/2019 12:44 | |
9 | EQT2 | 02/02/2019 12:47 | Event Text 05 | 10604927 | 10604927 | EQT2_02/02/2019 12:47 | |
10 | EQT2 | 02/02/2019 13:08 | Event Text 06 | 10604948 | 10604927 | EQT2_02/02/2019 12:47 | |
11 | EQT2 | 03/02/2019 18:02 | Event Text 09 | 10606682 | 10606682 | EQT2_03/02/2019 18:02 | |
12 | EQT2 | 03/02/2019 19:00 | Event Text 10 | 10606740 | 10606682 | EQT2_03/02/2019 18:02 | |
13 | EQT2 | 03/02/2019 19:00 | Event Text 11 | 10606740 | 10606682 | EQT2_03/02/2019 18:02 | |
14 | =IF(G13<>G12,J13,IF(J13-K12<60,K12,J13)) | =IF(K13=K12,L12,G13&"_"&H13) |
If any of you has an idea on how to do it it would be really great....
I'm just starting with Pbi so I definitly need some help 😉
thanks in advance
regards
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("fc0xDoAgDAXQq5jOJLS/KNDdA5i4Ee5/DTdSDDq/4bVG53ULBZKICJa6CSwp9bCW3Qu8HF50SDYuX5KG4P0kL9OT16LuwfQUY6ylGvOf9Ac=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"EQT NB" = _t, date = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"EQT NB", type text}, {"date", type datetime}}), #"Grouped Rows" = Table.Group(#"Changed Type", {"EQT NB"}, {{"AD", each _, type table}}), #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each let Source=[AD], Sort=Table.Sort(Source,{{"date", Order.Ascending}}), #"Added Index" = Table.AddIndexColumn(Sort, "Index", 1, 1), #"Added Index1" = Table.AddIndexColumn(#"Added Index", "Index.1", 0, 1), #"Merged Queries" = Table.NestedJoin(#"Added Index1",{"Index.1"},#"Added Index1",{"Index"},"Added Index1",JoinKind.LeftOuter), #"Expanded Added Index1" = Table.ExpandTableColumn(#"Merged Queries", "Added Index1", {"date"}, {"date.1"}), #"Added Custom" = Table.AddColumn(#"Expanded Added Index1", "Custom", each Duration.TotalMinutes([date]-[date.1])), #"Replaced Value" = Table.ReplaceValue(#"Added Custom",null,0,Replacer.ReplaceValue,{"Custom"}), #"Inserted Merged Column" = Table.AddColumn(#"Replaced Value", "Merged", each Text.Combine({[EQT NB], Text.From([date], "en-US")}, "-"), type text), #"Added Custom1" = Table.AddColumn(#"Inserted Merged Column", "Custom.1", each if [Custom]>60 then ">60" else null), #"Sorted Rows" = Table.Sort(#"Added Custom1",{{"Index", Order.Ascending}}), #"Added Custom2" = Table.AddColumn(#"Sorted Rows", "Custom.2", each if [Index]=1 then [Merged] else if [Custom.1]=">60" then [Merged] else null), #"Filled Down" = Table.FillDown(#"Added Custom2",{"Custom.2"}), #"Removed Other Columns" = Table.SelectColumns(#"Filled Down",{"EQT NB", "date", "Index", "Custom", "Custom.2"}) in #"Removed Other Columns"), #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"EQT NB", "AD"}), #"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns", "Custom", {"EQT NB", "date", "Index", "Custom", "Custom.2"}, {"EQT NB", "date", "Index", "Custom.1", "Custom.2"}), #"Renamed Columns" = Table.RenameColumns(#"Expanded Custom",{{"Custom.1", "Minutes in between"}, {"Custom.2", "Event_ID"}}) in #"Renamed Columns"
Hi @LY18 ,
Could you please mark the proper answers as solutions?
Best Regards,
sorry if i didn't answer before ... was on vacation
many thanks for the answer ... its eems very complete and detailed .... i will test it today ...
i'm not very experimented ... how do i "go into DAX" ?
Hi @LY18 ,
You can download the demo from attachment in my last reply. What do you mean "go into DAX"?
Best Regards,
yes i downloaded the file
i'm trying to reproduce it on my actual Database .... it's huge .... power BI edit query is taking a lot of time to calculate each "step"
i was mentioning DAX because there was a step that you name "Steps in the DAX." .... so i asked how to go into DAX .... i'm only manipulating the query editor of POwerBI
hello
first many thanks for your answer
i've tried your methode and it seems to work fine on a small dataset
but as soon as i tried to apply it on the real data set (a csv file of 200Mo) Power BI start to evaluate the query steps, i can see the size of the "evaluation" going up exponentially (up to 1Go) and then powerBI froze
;-(
Hi @LY18,
It's my pleasure.
Since you have a really big dataset, I guess my solution is complex for it. I have another idea. It should be a little case for Python and R. Which one can you apply in your production environment?
Best Regards,
hi,
i have no knowledge on R or Python but i got some colleagues that know R very well
is it easy to introduce a R script inside the query ? ... i saw there is a R visual availlable on powerbi desktop, but i have no idea how to do it
Hi @LY18 ,
It's R and Python in the Query Editor. Please refer to the snapshot below. Actually, I tried R that is quite slow at least in my code. Then I created a possible solution of Python. It runs 20 minutes to add 3.6 M rows of the new index. Maybe you can give it a try.
# 'dataset' holds the input data for this script import pandas as pd import datetime n_rows = len(dataset) i_row = 0 start_eqt = dataset['EQT NB'][0] start_date = datetime.datetime.strptime(dataset['date'][0], "%Y-%m-%dT%H:%M:%S.%f0") new_column = [] while i_row < n_rows: current_date = datetime.datetime.strptime(dataset['date'][i_row], "%Y-%m-%dT%H:%M:%S.%f0") if dataset['EQT NB'][i_row] == start_eqt: if current_date > start_date + datetime.timedelta(seconds=3600): start_date = current_date else: start_eqt = dataset['EQT NB'][i_row] start_date = current_date new_column.append(start_date) i_row += 1 dataset["new_col"] = new_column
Best Regards,
See my article on Mean Time Before Failure (MTBF) which uses EARLIER: http://community.powerbi.com/t5/Community-Blog/Mean-Time-Between-Failure-MTBF-and-Power-BI/ba-p/3395...
well, i guess your answer is in the "uptime = VAR next ...." formula
even if my case is not a MTBF calculation ... maybe it can help me with my eventID
I will need to deep dive into this to understand how to use this in my case
too late today 😉 , i will do this tomorow morning with a fresh brain
i will get back to you with what i understood
thanks
Even with a fresh brain I could not find how to do what I want from your "MTBF" exemple
sorry, my knowledge in PowerBI and complexe formula is still "low" .... i don't get how i can "tranfer" my excel formula into PowerBI
In your exemple you use EARLIER to get the data from previous row but not from the same column (that would be "uptime") .... or maybe i didn't understand...
i tried to add a blank query with this formula (inspired from your post)
reference = VAR EvalReference = EARLIER([reference]) - [reference in minutes] RETURN IF(EvalReference >=60, Earlier([Reference]),[date in minutes ID] )
but didn't work 😉
Can you give me more details?
Many thanks in advance for your time
regards
No one has an idea?
Hi @LY18,
You requirement here is dynamic. Actually, it's a recursion. Please download a solution from the attachment which combines Power Query and DAX. BTW, It isn't that perfect.
1. Power Query steps.
1.1 Add an index.
1.2 Calculate the minutes between the current and the previous line in the same EQT. (Add a new Column)
let lastLine = let currentIndex = [Index], currentEQT = [EQT NB]
in Table.SelectRows(#"Added Index", each [Index] = currentIndex -1 and [EQT NB] = currentEQT)
in if Table.IsEmpty(lastLine)
then 0
else Duration.TotalMinutes([date] - lastLine{0}[date])
1.3 Add a new index which can divide them into different groups.
List.Accumulate(let currentIndex = [Index], currentEQT = [EQT NB]
in Table.SelectRows(#"Added Custom", each [Index] <=currentIndex and [EQT NB] = currentEQT)[Minutes],
1, (seed, value) =>
if value <=60
then seed
else seed + 1)
1.4 Remove some columns.
2 Steps in the DAX.
2.1 Assign every date to a range. (THIS STEP ISN'T THAT SMART).
ranges = VAR currentEqt = [EQT NB] VAR currentIndex = [NewIndex] VAR currentDate = [date] VAR minDate = CALCULATE ( MIN ( Table2[date] ), FILTER ( 'Table2', Table2[EQT NB] = currentEqt && Table2[NewIndex] = currentIndex ) ) VAR minutes = DATEDIFF ( minDate, currentDate, MINUTE ) RETURN SWITCH ( TRUE (), minutes <= 60, minDate, minutes <= 120, minDate + TIME ( 1, 0, 0 ), minutes <= 180, minDate + TIME ( 2, 0, 0 ), minutes <= 240, minDate + TIME ( 3, 0, 0 ), minutes <= 300, minDate + TIME ( 4, 0, 0 ), DATE ( 9999, 12, 31 ) ) // If your data has a larger range, you have to add more branches when you see 9999-12-31.
2.2 Adjust the range to the final result.
Result = VAR currentRanges = [ranges] RETURN CALCULATE ( MIN ( 'Table2'[date] ), FILTER ( 'Table2', Table2[ranges] = currentRanges ) )
Best Regards,
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
110 | |
97 | |
77 | |
63 | |
55 |
User | Count |
---|---|
143 | |
109 | |
89 | |
84 | |
66 |