Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Mann
Resolver III
Resolver III

Combining the summary of records from Many to Many tables

Hi Guys,


I need help on this kind of data modelling.

I have two tables capturing Start Time and End Time for location of Individuals. One table captures swipe card details and other captures device location whcih Individual is using. As below:


Card Table:

Ind IDStart DateEnd Date
A8/1/2019 9AM8/1/2019 5PM
C8/2/2019 9AM8/2/2019 5PM
C8/3/2019 8AM8/3/2019 4PM
C8/4/2019 10AM8/4/2019 4PM
C8/5/2019 2PM8/5/2019 10PM
D8/3/2019 8AM8/5/2019 4PM
D8/6/2019 8AM8/6/2019 5PM


Device Table:

Ind IDStart DateEnd Date
B8/1/2019 10AM8/1/2019 8PM
C8/2/2019 10AM8/4/2019 6PM
C8/5/2019 11AM8/5/2019 9PM
D8/3/2019 9AM8/3/2019 4PM
D8/4/2019 8AM8/4/2019 6PM
D8/5/2019 9AM8/5/2019 6PM


These are DAX Calculated tables not the tables coming in Power Query so can't use Power query to combine these tables. I need to create a calculated table which shows the summary of these two as shown:


Summarized Table:

Ind IDStart DateEnd Date
A8/1/2019 9AM8/1/2019 5PM
B8/1/2019 10AM8/1/2019 8PM
C8/2/2019 9AM8/4/2019 6PM
C8/5/2019 11AM8/5/2019 10PM
D8/3/2019 8AM8/5/2019 6PM
D8/6/2019 8AM8/6/2019 5PM


Following are the rules to create a summary table:

1) Bring all individuals in summary tables if not present in any of the table. Therefore, we need to bring "A" from Card table and "B" from Device Table.

2) If some individuals are present in both the tables then summarized record will be shown. For e.g. Ind ID "C", he was detected on 2,3,4 and 5 of Aug in Card Table and on 2 and 5 in Device table. So if we see his presence on site then in summarized table two records will show his two visits. When finding these visits we need to take Min of Start times for both tables and Max of End times for both tables. The threshhold to decide two visits is gap of 8hours, therefore record of 5 Aug for C is not combined with other records.

3) Same rule applies to Ind ID "D".


I want to create this Summary table and then relate it to Card and Device table using relationships also.

Can somebody help me in this.

 

1 ACCEPTED SOLUTION

Thanks Sheng,

 

I achieved this using two calculated tables. I think it can be done using one single table also. Find the code below:

Test.1 = 
VAR _union =
    UNION ( Card, Device )
VAR _index =
    ADDCOLUMNS (
        _union,
        "Index", RANKX ( FILTER ( _union, [Ind ID] = EARLIER ( [Ind ID] ) ), [Start Time],, ASC )
    )
VAR _maxdate =
    FILTER (
        ADDCOLUMNS (
            _index,
            "Max Date", IF (
                [End Time]
                    < MAXX (
                        FILTER (
                            _index,
                            [Ind ID] = EARLIER ( [Ind ID] )
                                && [Index] <= EARLIER ( [Index] )
                        ),
                        [End Time]
                    ),
                0,
                1
            )
        ),
        [Max Date] <> 0
    )
VAR _nextindex =
    ADDCOLUMNS (
        _maxdate,
        "Next Index", RANKX (
            FILTER ( _maxdate, [Ind ID] = EARLIER ( [Ind ID] ) ),
            [Start Time],
            ,
            ASC
        )
    )
VAR _lookup =
    ADDCOLUMNS (
        _nextindex,
        "New Start Date",
        VAR _prevenddate1 =
            MAXX (
                FILTER (
                    _nextindex,
                    [Ind ID] = EARLIER ( [Ind ID] )
                        && [Next Index]
                            = EARLIER ( [Next Index] ) - 1
                ),
                [End Time]
            )
        RETURN
            IF (
                [Start Time] < _prevenddate1
                    && [End Time] > _prevenddate1,
                _prevenddate1,
                [Start Time]
            )
    )
VAR _gapdurationfinal =
    ADDCOLUMNS (
        _lookup,
        "Gap Duration",
        VAR _prevenddate2 =
            MAXX (
                FILTER (
                    _nextindex,
                    [Ind ID] = EARLIER ( [Ind ID] )
                        && [Next Index]
                            = EARLIER ( [Next Index] ) - 1
                ),
                [End Time]
            )
        VAR _gapduration =
            DATEDIFF ( _prevenddate2, [New Start Date], SECOND )
        RETURN
            IF (
                ISBLANK ( DATEDIFF ( _prevenddate2, [New Start Date], SECOND ) ),
                BLANK (),
                _gapduration
            )
    )
VAR _nextduration =
    ADDCOLUMNS (
        _gapdurationfinal,
        "Next Duration", MAXX (
            FILTER (
                _gapdurationfinal,
                [Ind ID] = EARLIER ( [Ind ID] )
                    && [Next Index]
                        = EARLIER ( [Next Index] ) + 1
            ),
            [Gap Duration]
        )
    )
VAR _buffer = 8 * 3600
VAR _tomerge =
    ADDCOLUMNS (
        _nextduration,
        "To Merge", IF (
            [Gap Duration] <= _buffer
                && NOT ( ISBLANK ( [Gap Duration] ) ),
            IF (
                [Next Duration] <= _buffer
                    && NOT ( ISBLANK ( [Next Duration] ) ),
                "X",
                "Y"
            )
        )
    )
RETURN
    SELECTCOLUMNS (
        _tomerge,
        "Ind ID", [Ind ID],
        "Start Time", [New Start Date],
        "End Time", [End Time],
        "Index", [Next Index],
        "Gap Duration Sec", [Gap Duration],
        "ToMerge", [To Merge]
    )

I then created one supporting calculated column in this table as:

NewOutsTime = 
VAR NextToMerge =
    CALCULATE (
        DISTINCT ( 'Test.1'[ToMerge] ),
        'Test.1'[Index]
            = ( EARLIER ( 'Test.1'[Index] ) + 1 ),
        ALLEXCEPT ( 'Test.1',  'Test.1'[Ind ID])
    )
RETURN
    IF (
        NextToMerge IN { "X", "Y" }
            && ISBLANK ( 'Test.1'[ToMerge] ),
        VAR NextY =
            CALCULATE (
                MIN ( 'Test.1'[Index] ),
                'Test.1'[Index] > EARLIER ( 'Test.1'[Index] ),
                'Test.1'[ToMerge] = "Y",
                ALLEXCEPT ( 'Test.1', 'Test.1'[Ind ID] )
            )
        RETURN
            CALCULATE (
                DISTINCT ( 'Test.1'[End Time] ),
                'Test.1'[Index] = NextY,
                ALLEXCEPT ( 'Test.1', 'Test.1'[Ind ID])
            ),
        'Test.1'[End Time]
    )

Then the last calculated table:

Test.2 = 
CALCULATETABLE (
    SELECTCOLUMNS (
        'Test.1',
        "Ind ID", 'Test.1'[Ind ID],
        "Start Date", 'Test.1'[Start Time],
        "End Date", 'Test.1'[NewOutsTime]
    ),
    'Test.1'[ToMerge] = BLANK ()
)

Thanks.

View solution in original post

2 REPLIES 2
v-shex-msft
Community Support
Community Support

Hi @Mann ,

In fact, these are hard to achieve in calculated table based on dax formula.

In addition, you can try to use following calculate table formula to get the summarize table to combining records:

Test = 
VAR _union =
    UNION ( Card, Device )
VAR _temp =
    ADDCOLUMNS (
        _union,
        "SDate", DATEVALUE ( [Start Time] ),
        "EDate", DATEVALUE ( [End Time] )
    )
VAR _model =
    SUMMARIZE ( _temp, [Ind ID], [SDate], [EDate] )
VAR _mutli =
    FILTER ( _model, DATEDIFF ( [SDate], [EDate], DAY ) > 0 )
VAR _single =
    SELECTCOLUMNS (
        FILTER ( _model, DATEDIFF ( [SDate], [EDate], DAY ) = 0 ),
        "ID", [Ind ID],
        "SD", [SDate],
        "ED", [EDate]
    )
RETURN
    ADDCOLUMNS (
        UNION (
            _mutli,
            FILTER (
                _single,
                COUNTROWS (
                    FILTER (
                        _mutli,
                        [Ind ID] = EARLIER ( [ID] )
                            && [SD] IN CALENDAR ( [SDate], [EDate] )
                    )
                ) = 0
            )
        ),
        "STime", MINX (
            FILTER (
                _union,
                [Ind ID] = EARLIER ( [Ind ID] )
                    && DATEVALUE ( [Start Time] ) 
            ),
            TIMEVALUE ( [Start Time] )
        ),
        "ETime", MAXX (
            FILTER (
                _union,
                [Ind ID] = EARLIER ( [Ind ID] )
                    && DATEVALUE ( [End Time] ) = EARLIER ( [EDate] )
            ),
            TIMEVALUE ( [End Time] )
        )
    )

Notice: I still not found methods to deal with multiple date ranges combining.(e.g. combine 8/1/2019,8/3/2019 and 8/2/2019,8/4/2019)

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Thanks Sheng,

 

I achieved this using two calculated tables. I think it can be done using one single table also. Find the code below:

Test.1 = 
VAR _union =
    UNION ( Card, Device )
VAR _index =
    ADDCOLUMNS (
        _union,
        "Index", RANKX ( FILTER ( _union, [Ind ID] = EARLIER ( [Ind ID] ) ), [Start Time],, ASC )
    )
VAR _maxdate =
    FILTER (
        ADDCOLUMNS (
            _index,
            "Max Date", IF (
                [End Time]
                    < MAXX (
                        FILTER (
                            _index,
                            [Ind ID] = EARLIER ( [Ind ID] )
                                && [Index] <= EARLIER ( [Index] )
                        ),
                        [End Time]
                    ),
                0,
                1
            )
        ),
        [Max Date] <> 0
    )
VAR _nextindex =
    ADDCOLUMNS (
        _maxdate,
        "Next Index", RANKX (
            FILTER ( _maxdate, [Ind ID] = EARLIER ( [Ind ID] ) ),
            [Start Time],
            ,
            ASC
        )
    )
VAR _lookup =
    ADDCOLUMNS (
        _nextindex,
        "New Start Date",
        VAR _prevenddate1 =
            MAXX (
                FILTER (
                    _nextindex,
                    [Ind ID] = EARLIER ( [Ind ID] )
                        && [Next Index]
                            = EARLIER ( [Next Index] ) - 1
                ),
                [End Time]
            )
        RETURN
            IF (
                [Start Time] < _prevenddate1
                    && [End Time] > _prevenddate1,
                _prevenddate1,
                [Start Time]
            )
    )
VAR _gapdurationfinal =
    ADDCOLUMNS (
        _lookup,
        "Gap Duration",
        VAR _prevenddate2 =
            MAXX (
                FILTER (
                    _nextindex,
                    [Ind ID] = EARLIER ( [Ind ID] )
                        && [Next Index]
                            = EARLIER ( [Next Index] ) - 1
                ),
                [End Time]
            )
        VAR _gapduration =
            DATEDIFF ( _prevenddate2, [New Start Date], SECOND )
        RETURN
            IF (
                ISBLANK ( DATEDIFF ( _prevenddate2, [New Start Date], SECOND ) ),
                BLANK (),
                _gapduration
            )
    )
VAR _nextduration =
    ADDCOLUMNS (
        _gapdurationfinal,
        "Next Duration", MAXX (
            FILTER (
                _gapdurationfinal,
                [Ind ID] = EARLIER ( [Ind ID] )
                    && [Next Index]
                        = EARLIER ( [Next Index] ) + 1
            ),
            [Gap Duration]
        )
    )
VAR _buffer = 8 * 3600
VAR _tomerge =
    ADDCOLUMNS (
        _nextduration,
        "To Merge", IF (
            [Gap Duration] <= _buffer
                && NOT ( ISBLANK ( [Gap Duration] ) ),
            IF (
                [Next Duration] <= _buffer
                    && NOT ( ISBLANK ( [Next Duration] ) ),
                "X",
                "Y"
            )
        )
    )
RETURN
    SELECTCOLUMNS (
        _tomerge,
        "Ind ID", [Ind ID],
        "Start Time", [New Start Date],
        "End Time", [End Time],
        "Index", [Next Index],
        "Gap Duration Sec", [Gap Duration],
        "ToMerge", [To Merge]
    )

I then created one supporting calculated column in this table as:

NewOutsTime = 
VAR NextToMerge =
    CALCULATE (
        DISTINCT ( 'Test.1'[ToMerge] ),
        'Test.1'[Index]
            = ( EARLIER ( 'Test.1'[Index] ) + 1 ),
        ALLEXCEPT ( 'Test.1',  'Test.1'[Ind ID])
    )
RETURN
    IF (
        NextToMerge IN { "X", "Y" }
            && ISBLANK ( 'Test.1'[ToMerge] ),
        VAR NextY =
            CALCULATE (
                MIN ( 'Test.1'[Index] ),
                'Test.1'[Index] > EARLIER ( 'Test.1'[Index] ),
                'Test.1'[ToMerge] = "Y",
                ALLEXCEPT ( 'Test.1', 'Test.1'[Ind ID] )
            )
        RETURN
            CALCULATE (
                DISTINCT ( 'Test.1'[End Time] ),
                'Test.1'[Index] = NextY,
                ALLEXCEPT ( 'Test.1', 'Test.1'[Ind ID])
            ),
        'Test.1'[End Time]
    )

Then the last calculated table:

Test.2 = 
CALCULATETABLE (
    SELECTCOLUMNS (
        'Test.1',
        "Ind ID", 'Test.1'[Ind ID],
        "Start Date", 'Test.1'[Start Time],
        "End Date", 'Test.1'[NewOutsTime]
    ),
    'Test.1'[ToMerge] = BLANK ()
)

Thanks.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.