I need to calculate the number of concurrent events by category for each time period, and then the peak (max) concurrent events for each category. I was able to do this, however:
The sample file's base data table is 100 rows from an Excel file; my real data is closer to 100,000,000 rows and comes from a REST API. When I load my data it takes forever, every calculation is slow, and I can't even save the file after loading. I can't pre-aggregate because there are connected tables I need to filter using slicers.
So, in short, I need a more efficient way of making these calculations, or advice on the next best steps (e.g. get better hardware).
My data looks like this prior to anything:
EventID ApplicationId StartTime EndTime
"A1" "TV" 2020-09-10 13:00:12.000 2020-09-11 07:29:17.000
"A2" "Food" 2020-09-09 01:00:00.000 2020-09-09 16:11:25.000
"A3" "Food" 2020-09-10 04:00:00.000 2020-09-10 06:44:17.000
"A4" "TV" 2020-09-10 10:00:00.000 2020-09-10 18:36:59.000
But for my calculations the only solution I’ve found expands my table to this with M:
EventID ApplicationId Time
"A1" "TV" 2020-09-10 01:00:00.000
"A1" "TV" 2020-09-10 01:01:00.000
"A1" "TV" 2020-09-10 01:02:00.000
"A1" "TV" 2020-09-10 01:03:00.000
"A1" "TV" 2020-09-10 01:04:00.000
OR this with DAX:
EventID ApplicationId StartTime EndTime Time
"A1" "TV" 2020-09-10 13:00:12.000 2020-09-11 07:29:17.000 2020-09-10 13:04:00.000
"A1" "TV" 2020-09-10 13:00:12.000 2020-09-11 07:29:17.000 2020-09-10 13:03:00.000
"A1" "TV" 2020-09-10 13:00:12.000 2020-09-11 07:29:17.000 2020-09-10 13:02:00.000
"A1" "TV" 2020-09-10 13:00:12.000 2020-09-11 07:29:17.000 2020-09-10 13:01:00.000
"A1" "TV" 2020-09-10 13:00:12.000 2020-09-11 07:29:17.000 2020-09-10 13:00:00.000
I then use a combination of two measures to calculate (1) the number of concurrent events at any given time and (2) the maximum number of concurrent events over the interval determined by a slicer.
[_#category_sessions] =
VAR _categories = DISTINCTCOUNT ( myTable_m_expand[EventID] )
RETURN
    SUMX ( DISTINCT ( myTable_m_expand[Time] ), _categories )

[_#maxConcurrent] =
MAXX (
    KEEPFILTERS ( VALUES ( 'myTable_m_expand'[Time] ) ),
    CALCULATE ( [_#category_sessions] )
)
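For readers with the same problem: the usual way to avoid expanding the table at all is the "events in progress" pattern. Keep the original four-column table at its natural grain and, for each candidate time, count the rows whose [StartTime, EndTime] interval covers it. Below is a hedged sketch, assuming a disconnected table 'Periods'[Time] holding the candidate timestamps (that table name is an assumption, not from the thread):

```dax
-- Hypothetical measure: count of events whose interval covers the selected time.
[_#concurrent] =
VAR _t = SELECTEDVALUE ( 'Periods'[Time] )
RETURN
    CALCULATE (
        COUNTROWS ( myTable ),
        myTable[StartTime] <= _t,
        myTable[EndTime] >= _t
    )

-- Hypothetical peak over all times visible to the slicer.
[_#maxConcurrent_alt] =
MAXX ( VALUES ( 'Periods'[Time] ), [_#concurrent] )
```

Because the fact table stays at one row per event, this tends to scale far better than the per-minute expanded version.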
CURRENT PC SPECS:
i5-8500
16 GB RAM
Hi, @lance_6
Sorry, I know very little about REST APIs, but I would like to know your connection mode.
Are you connecting in Import mode? In that case, if the data source you are connecting to is too big, it will overload your computer. Why not try changing the connection mode to DirectQuery or a live connection? The computer may run more smoothly that way.
Best Regards,
Community Support Team _ Zeon Zheng
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
Hi, the first suggestion I'd make is to try changing this line of code:
VAR _categories = DISTINCTCOUNT ( myTable_m_expand[EventID] )
to:
VAR _categories = SUMX ( DISTINCT ( myTable_m_expand[EventID] ), 1 )
DISTINCTCOUNT is memory-intensive when the column has high cardinality.
If you have experience with DAX Studio, it is very useful for checking measure performance within your model.
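For example, here is the kind of query you could run in DAX Studio (with Server Timings enabled) to benchmark the measures in this thread. This is a sketch reusing the measure and column names shown above:

```dax
-- Hypothetical DAX Studio benchmark query: enable Server Timings,
-- run once per variant of the measure, and compare durations.
EVALUATE
SUMMARIZECOLUMNS (
    myTable_m_expand[ApplicationId],
    "MaxConcurrent", [_#maxConcurrent]
)
```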
Hope this helps.
So I tried this with my actual data model and it crashed due to high RAM consumption, whereas DISTINCTCOUNT is able to run.
Thank you, I'll give this a try and see what it does to performance.