cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
lance_6
Helper II
Helper II

How to more efficiently calculate concurrent event counts

 

I need to calculate the number concurrent events by category for each time period and then the peak/max concurrent events for each category. I was able to do this, however: 

 

The sample file’s base data table is 100 rows from an excel file,  my real data is closer to 100,000,000 and connected to a Rest API. When I load in my data it takes forever. When I try to do my calculations everything is slow. I am unable to even save the file after I load in my data. I can’t pre-aggregate because there are connecting tables I need to filter by using slicers.  

SO In short, I need a more efficient way of making these calculations. Or advice on what are the next best steps (ex: get better hardware, etc.). 

 

 

My data looks like this prior to anything: 

EventID   ApplicationId  StartTime                             EndTime

"A1"       "TV"                  2020-09-10 13:00:12.000    2020-09-11 07:29:17.000

"A2"       "Food"              2020-09-09 01:00:00.000    2020-09-09 16:11:25.000

"A3"       "Food"              2020-09-10 04:00:00.000    2020-09-10 06:44:17.000

"A4"       "TV"                  2020-09-10 10:00:00.000    2020-09-10 18:36:59.000

 

 

But for my calculations the only solution I’ve found expands my table to this with M: 

EventID  ApplicationId   Time

"A1"       "TV"                  2020-09-10 01:00:00.000

"A1"       "TV"                  2020-09-10 01:01:00.000

"A1"       "TV"                  2020-09-10 01:02:00.000

"A1"       "TV"                  2020-09-10 01:03:00.000

"A1"       "TV"                  2020-09-10 01:04:00.000

 

OR this with DAX: 

EventID    ApplicationId    StartTime                               EndTime                              Time

"A1"         "TV"                   2020-09-10 13:00:12.000      2020-09-11 07:29:17.000    2020-09-10 13:04:00.000

"A1"         "TV"                   2020-09-10 13:00:12.000      2020-09-11 07:29:17.000    2020-09-10 13:03:00.000

"A1"         "TV"                   2020-09-10 13:00:12.000      2020-09-11 07:29:17.000    2020-09-10 13:02:00.000

"A1"         "TV"                   2020-09-10 13:00:12.000      2020-09-11 07:29:17.000    2020-09-10 13:01:00.000

"A1"         "TV"                   2020-09-10 13:00:12.000      2020-09-11 07:29:17.000    2020-09-10 13:00:00.000

 

 

I then have a combination of 2 measures to calculate #1 number of concurrent events for any given time and #2 The maximum number of concurrent events for the interval determined by a slicer. 

 

[_#category_sessions]

VAR _categories=DISTINCTCOUNT(myTable_m_expand[EventID])

RETURN

SUMX(DISTINCT(myTable_m_expand[Time]),_categories)



[_#maxConcurrent]

MAXX(

    KEEPFILTERS(VALUES('myTable_m_expand'[Time])),

    CALCULATE([_#category_sessions])

)

 

CURRENT PC SPECS: 

I5-8500

16GB RAM

 

 

SAMPLE FILE 

4 REPLIES 4
v-angzheng-msft
Community Support
Community Support

Hi, @lance_6 


Sorry I know very little about Rest API,  but I would like to know what is your connection mode.

Are you connecting data in import mode?  in that case if the data source you are connecting to is too big it will overload your computer, why don't you try to change the connection mode to direct query or live connection, maybe the computer will run more smoothly this way

 

 

Best Regards,
Community Support Team _ Zeon Zheng
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

davehus
Resolver II
Resolver II

Hi, The first suggestion I'd make is first try and change this line of code VAR _categories=DISTINCTCOUNT(myTable_m_expand[EventID])

 

To VAR _categories=SUMX(DISTINCT(myTable_m_expand[EventID]),1)

 

Distinct Count is memory intensive where you have high cardinality in a column.

 

If you have experience with Dax Studio, it is very useful for checking measure performance within your model.

 

Hope this helps.

 

 

 

So I tried this with my actuall data model and it crashed it due to high ram consumption where as COUNDISTINCT is able to run. 

Thank you, I'll give this a try and see what it does to perfomance. 

Helpful resources

Announcements
UG GA Amplification 768x460.png

Launching new user group features

Learn how to create your own user groups today!

November Power BI Update 768x460.png

Check it Out!

Click here to read more about the November 2021 Updates!

M365 768x460.jpg

Microsoft 365 Collaboration Conference | December 7–9, 2021

Join us, in-person, December 7–9 in Las Vegas, for the largest gathering of the Microsoft community in the world.