cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
Post Patron
Post Patron

Help to solve the below logic using power bi

Dear,

 

I am trying to build the below logic in powerbi. Can anyone help on this, I tried and able to get the result only one part. Data is about millions, and need to think about performance as well.

 

This is the wifi access analysis I am doing. I have a date field , Name and status (Success, Failure and Incomplete). Below the points to put in the logic

 

  • If a person has got more than one entry within 4 hours of gap in datetime (eg: John), if status success is there, it should pick only the records with "Success" alone (Eg: There are four records for John, out of 3 (First 3) are in the gap of 4 hours, so it should be in one group, since "Success" is availabe in one record, only this record should be in o/p. Johns, fourth record with status "Incomplete" should be considered as separate group since it is more than 4 hours gap from the previous record and so this record will be in output.
  • Paul has got two records, both are not 4 hours gap of the datetime, so both will be in a group and only one record should go to outputs, since success is not there, next priority is for the failure, and the record with Failure status will go to output

 

                       Input                                                                          Desired O/P

 

DateTimeNameStatus DateTimeNameStatus
2/2/2017 18:58JohnCompleted 2/2/2017 18:58JohnCompleted
2/2/2017 19:57JohnFailed 4/2/2017 18:56JohnIncomplete
2/2/2017 20:56JohnIncomplete
 
   
 
2/2/2017 18:56PaulFailed
4/2/2017 18:56JohnIncomplete2/2/2017 18:53MathewCompleted
2/2/2017 18:56PaulFailed2/2/2017 18:51JoseIncomplete
2/2/2017 18:55PaulIncomplete2/2/2017 18:51AniCompleted
2/2/2017 18:53MathewCompleted    
2/2/2017 18:52MathewFailed    
2/2/2017 18:51JoseIncomplete    
2/2/2017 18:51AniCompleted    

 

Br,

SHams

5 REPLIES 5
Highlighted
Community Champion
Community Champion

If I interpret the logic correctly, then you need:

all records with status "Completed", and

all records with status "Failed"

       if there is no record for the same person with status "Completed" within +/- 4 hours

all records with status "Incomplete"

       if there is no record for the same person with status "Completed" or with status "Failed" within +/- 4 hours

 

I put your data in an Excel file and created the Power Query (M) code below.
To be honest I have my doubts about performance, but you may give it a try on some test data.

 

let
    Source = Excel.Workbook(File.Contents("C:\Users\Marcel\Documents\Forum bijdragen\Power BI Community\Help solve logic.xlsx"), null, true),
    Input_Table = Source{[Item="Input",Kind="Table"]}[Data],
    Typed = Table.TransformColumnTypes(Input_Table,{{"DateTime", type datetime}, {"Name", type text}, {"Status", type text}}),
    #"Added Custom" = Table.AddColumn(Typed, "Keep", 
                                        (x) => if x[Status] = "Completed" 
                                               then true
                                               else if x[Status] = "Failed"
                                                    then 0 = Table.RowCount(Table.SelectRows(Typed, each [Name]     =  x[Name] and
                                                                                                         [Status]   =  "Completed" and 
                                                                                                         [DateTime] >= x[DateTime] - #duration(0,4,0,0) and
                                                                                                         [DateTime] <= x[DateTime] + #duration(0,4,0,0)))
                                                    else 0 = Table.RowCount(Table.SelectRows(Typed, each [Name]     =  x[Name] and
                                                                                                         ([Status]  =  "Completed" or [Status] = "Failed") and 
                                                                                                         [DateTime] >= x[DateTime] - #duration(0,4,0,0) and
                                                                                                         [DateTime] <= x[DateTime] + #duration(0,4,0,0)))
                                        ),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Keep] = true)),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Keep"})
in
    #"Removed Columns"

 

Specializing in Power Query Formula Language (M)
Highlighted
Super User III
Super User III

Hi @shamsuddeenvp what I would do is the following:

 

  • Create another column in which you can calculate the 4 hour increments. As this is one of the foundations for your data.
  • I would then use this to get a count or distinct count of each user.
  • And finally from there you can then create a calculated measure which would then be able to show what you require?

And I would suggest if you are looking for performance as @MarcelBeug suggested to potentially the column in the Query Editor but the rest as Calculated columns, so that you can leverage the performance of the Vertipaq engine.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

Highlighted

Thanks a lot @MarcelBeug

 

This is the logic what I am exactly looking for. Great.

 

However, it takes long time since my data is more (in lak)..  I will have to try some alternate solution.

 

Br,

Shams

Highlighted

I had done this using DAX queries.

 

Br,

Shams

Highlighted

To address your performace issues, you need to split your date column in to two columns.

 

The other should carry the Hour or Minute of the day as an integer.  This will make a massive difference over the size of your model and make loading and calculations much faster.

 

Then you can do the rest in DAX or PQ.  When the actions are within 4 hours on the same day it's easy.  If they split a day, you just need a basic IF statement to handle that.

 

If you have millions of rows, you have to avoid having columns that are highly unique like your datetime column.


To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

Helpful resources

Announcements
Community Conference

Power Platform Community Conference

Check out the on demand sessions that are available now!

Community Conference

Microsoft Power Platform Communities

Check out the Winners!

secondImage

Create an end-to-end data and analytics solution

Learn how Power BI works with the latest Azure data and analytics innovations at the digital event with Microsoft CEO Satya Nadella.

Top Solution Authors
Top Kudoed Authors