Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
peteru9067
Helper III
Helper III

Count duplicates only once if within a certain duration i.e 24hrs

I am trying to create a dashboard of the # of Trips in my facility by day, month, and year. It may look simple however, sometimes during startup after a plant outage, a trip may occur several times during this period before getting the plant stable and running. 

 

I would like to filter out these nuisance trips or count them as one if it occurs within a 24hr period. 

 

So in the table below what I should expect or see on my dashboard is 

LXSHH-0002 occuring 2 times (once in 7/29 and again in 8/29)

FXSHH-0012 occuring 2 times in Feb and Mar

LXSHH-0004 occuring once

 

unitIPL TagStatusStartEndDuration
U30LXSHH-0002Trips7/29/2021 1:107/29/2021 1:100
U30LXSHH-0002Trips7/29/2021 1:137/29/2021 3:026537.739
U30LXSHH-0002Trips7/29/2021 3:567/29/2021 3:560
U30LXSHH-0002Trips7/29/2021 4:187/29/2021 4:20152.612
U30LXSHH-0002Trips7/29/2021 17:217/29/2021 17:210
U05FXSHH-0012Trips2/20/2021 6:082/20/2021 6:271126.646
U05FXSHH-0012Trips3/9/2021 19:103/9/2021 19:16383.140991
U05FXSHH-0012Trips3/9/2021 19:203/9/2021 19:27410.959992
U30LXSHH-0004Trips4/11/2021 9:444/15/2021 18:44377973.975
U30LXSHH-0002Trips8/29/2021 4:338/29/2021 4:3465.6090088
U30LXSHH-0002Trips8/29/2021 4:388/29/2021 4:380
U30LXSHH-0002Trips8/29/2021 4:418/29/2021 4:410
U30LXSHH-0002Trips8/29/2021 4:548/29/2021 4:5566.9769897
U30LXSHH-0002Trips8/29/2021 4:598/29/2021 4:590
2 ACCEPTED SOLUTIONS
smpa01
Super User
Super User

@peteru9067  does this work for you?

Measure =
VAR _0 =
    GENERATE (
        'Table',
        DATESBETWEEN ( 'Calendar'[Date], 'Table'[Start], 'Table'[End] )
    )
VAR _1 =
    SUMMARIZE ( _0, 'Table'[IPL Tag], 'Calendar'[Date] )
VAR _3 =
    SUMMARIZE (
        NATURALINNERJOIN ( _1, 'Calendar' ),
        'Calendar'[Year],
        'Calendar'[Month],
        'Table'[IPL Tag]
    )
RETURN
    COUNTX ( _3, [IPL Tag] )

 

smpa01_1-1638286229308.pngsmpa01_2-1638286242534.png

 

 

pbix is attached 

Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs

View solution in original post

@peteru9067  can you try this

Measure4 = 
VAR _0 =
    SUMMARIZE ( 'Table', 'Table'[IPL Tag], 'Table'[Start] )
VAR _1 =
    ADDCOLUMNS (
        _0,
        "rank", RANKX ( FILTER ( _0, [IPL Tag] = EARLIER ( [IPL Tag] ) ), [Start],, ASC )
    )
VAR _2 =
    FILTER (
        ADDCOLUMNS (
            _1,
            "diff",
                DATEDIFF (
                    MAXX (
                        FILTER (
                            _1,
                            [IPL Tag] = EARLIER ( [IPL Tag] )
                                && [rank]
                                    = EARLIER ( [rank] ) - 1
                        ),
                        [Start]
                    ),
                    [Start],
                    HOUR
                )
        ),
        [diff]
            == BLANK ()
                || [diff] >= 24
    )
RETURN
    COUNTX ( _2, [IPL Tag] )
Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs

View solution in original post

10 REPLIES 10
smpa01
Super User
Super User

@peteru9067  does this work for you?

Measure =
VAR _0 =
    GENERATE (
        'Table',
        DATESBETWEEN ( 'Calendar'[Date], 'Table'[Start], 'Table'[End] )
    )
VAR _1 =
    SUMMARIZE ( _0, 'Table'[IPL Tag], 'Calendar'[Date] )
VAR _3 =
    SUMMARIZE (
        NATURALINNERJOIN ( _1, 'Calendar' ),
        'Calendar'[Year],
        'Calendar'[Month],
        'Table'[IPL Tag]
    )
RETURN
    COUNTX ( _3, [IPL Tag] )

 

smpa01_1-1638286229308.pngsmpa01_2-1638286242534.png

 

 

pbix is attached 

Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs

Hi Sorry to bother you..... the code did not exactly work after further review of my huge dataset. It counted the number of occurrences for each tag by day....... I wanted to only count the tag once if it occured within a 24hr time frame.

I am assuming it should only look at the Start and go down row by row as long as the start time is still within 24hr from the initial start time count that tag as 1. once outside the 24 hr then increment that tag by another one.

@peteru9067  so you are saying that in the following case, the code should expand everything inbetween 4/11/2021 and 4/15/2021 and see if they are all consecutive dates and if yes, then should be counted as 1?

 

U30 LXSHH-0004 Trips 4/11/2021 9:44 4/15/2021
Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs

@smpa01 not exactly so if I have LXSHH-0002 that shows up several times on the 7/29 (within a 24hr period) count that as just 1 rather than each time it shows. and then if it occurs again say the 8/15, then it becomes count of 2

@peteru9067  so LXSHH-0004 should count as 5

 

try this

 

Measure 2 = 
VAR _0 =
    GENERATE (
        'Table',
        DATESBETWEEN ( 'Calendar'[Date], 'Table'[Start], 'Table'[End] )
    )
VAR _1 =
    SUMMARIZE ( _0, 'Table'[IPL Tag], 'Calendar'[Date] )
VAR _3 =
    
        NATURALINNERJOIN ( _1, 'Calendar' )
RETURN COUNTX(_3,[IPL Tag])

 

 

smpa01_0-1638293323976.pngsmpa01_1-1638293350977.png

 

Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs

how are you arriving at LXSHH-0004 as 5? it should only appear once since it is only listed once in the origial table. I think what we need to do is count the rows and if the start time is still within the 24hr window from the initial start then it should only be counted once.

 

For example 

Lxshh-0002 7/29 1pm

lxshh-0002 7/29 2pm

lxshh-0002 7/29 5pm

lxshh-0004 7/29 2pm

lxshh-0004 7/29 3pm

lxshh-0002 8/14 10am

lxshh-0004 10/15 1pm

lxshh-0004 11/21 10am

 

hence Lxshh-0002 count should only be 2 and 

lxshh-0004 should be 3

@smpa01 , so as you can see in the original table I posted LXSHH-0002 shows up many times however since there are multiple times it shows up in a 24hr duration on different dates the total count should be total of 2 (7/29 & 8/29).......

@peteru9067  can you try this

Measure4 = 
VAR _0 =
    SUMMARIZE ( 'Table', 'Table'[IPL Tag], 'Table'[Start] )
VAR _1 =
    ADDCOLUMNS (
        _0,
        "rank", RANKX ( FILTER ( _0, [IPL Tag] = EARLIER ( [IPL Tag] ) ), [Start],, ASC )
    )
VAR _2 =
    FILTER (
        ADDCOLUMNS (
            _1,
            "diff",
                DATEDIFF (
                    MAXX (
                        FILTER (
                            _1,
                            [IPL Tag] = EARLIER ( [IPL Tag] )
                                && [rank]
                                    = EARLIER ( [rank] ) - 1
                        ),
                        [Start]
                    ),
                    [Start],
                    HOUR
                )
        ),
        [diff]
            == BLANK ()
                || [diff] >= 24
    )
RETURN
    COUNTX ( _2, [IPL Tag] )
Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs

Exactly.... this did the trick....... thank you so much @smpa01 ....... 

Wow..... Thank you very much..... this was exactly what I was looking for and struggling with for the past 2 days. There is a second portion to this which I will post shortly. 

Again Many many thanks sir

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors