
lance_6
Helper II

How to more efficiently calculate concurrent event counts

 

I need to calculate the number of concurrent events by category for each time period, and then the peak/max concurrent events for each category. I was able to do this, however: 

 

The sample file’s base data table is 100 rows from an Excel file; my real data is closer to 100,000,000 rows and comes from a REST API. When I load my data it takes forever, and when I try to do my calculations everything is slow. I am unable to even save the file after I load my data. I can’t pre-aggregate because there are connected tables I need to filter using slicers.  

So, in short, I need a more efficient way of making these calculations, or advice on the next best steps (e.g., get better hardware, etc.). 

 

 

My data looks like this before any transformation: 

EventID   ApplicationId   StartTime                  EndTime
"A1"      "TV"            2020-09-10 13:00:12.000    2020-09-11 07:29:17.000
"A2"      "Food"          2020-09-09 01:00:00.000    2020-09-09 16:11:25.000
"A3"      "Food"          2020-09-10 04:00:00.000    2020-09-10 06:44:17.000
"A4"      "TV"            2020-09-10 10:00:00.000    2020-09-10 18:36:59.000

 

 

But for my calculations, the only solution I’ve found expands my table to this with M: 

EventID   ApplicationId   Time
"A1"      "TV"            2020-09-10 01:00:00.000
"A1"      "TV"            2020-09-10 01:01:00.000
"A1"      "TV"            2020-09-10 01:02:00.000
"A1"      "TV"            2020-09-10 01:03:00.000
"A1"      "TV"            2020-09-10 01:04:00.000

 

Or to this with DAX: 

EventID   ApplicationId   StartTime                  EndTime                    Time
"A1"      "TV"            2020-09-10 13:00:12.000    2020-09-11 07:29:17.000    2020-09-10 13:04:00.000
"A1"      "TV"            2020-09-10 13:00:12.000    2020-09-11 07:29:17.000    2020-09-10 13:03:00.000
"A1"      "TV"            2020-09-10 13:00:12.000    2020-09-11 07:29:17.000    2020-09-10 13:02:00.000
"A1"      "TV"            2020-09-10 13:00:12.000    2020-09-11 07:29:17.000    2020-09-10 13:01:00.000
"A1"      "TV"            2020-09-10 13:00:12.000    2020-09-11 07:29:17.000    2020-09-10 13:00:00.000

 

 

I then have a combination of two measures to calculate (1) the number of concurrent events at any given time and (2) the maximum number of concurrent events over the interval determined by a slicer. 

 

_#category_sessions =
VAR _categories =
    DISTINCTCOUNT ( myTable_m_expand[EventID] )    // distinct events visible in the current filter context
RETURN
    SUMX ( DISTINCT ( myTable_m_expand[Time] ), _categories )

_#maxConcurrent =
MAXX (
    KEEPFILTERS ( VALUES ( 'myTable_m_expand'[Time] ) ),
    CALCULATE ( [_#category_sessions] )    // session count per Time value; MAXX takes the peak
)

 

CURRENT PC SPECS: 

i5-8500
16 GB RAM

 

 

SAMPLE FILE 

4 REPLIES
v-angzheng-msft
Community Support

Hi, @lance_6 


Sorry, I know very little about REST APIs, but I would like to know what your connection mode is.

Are you connecting to the data in Import mode? In that case, if the data source you are connecting to is too big, it will overload your computer. Why don't you try changing the connection mode to DirectQuery or a live connection? The computer may run more smoothly that way.

 

 

Best Regards,
Community Support Team _ Zeon Zheng
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.

davehus
Memorable Member

Hi, the first suggestion I'd make is to try changing this line of code:

VAR _categories = DISTINCTCOUNT ( myTable_m_expand[EventID] )

 

to this:

VAR _categories = SUMX ( DISTINCT ( myTable_m_expand[EventID] ), 1 )

 

DISTINCTCOUNT is memory-intensive where you have high cardinality in a column.
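
For reference, a sketch of the first measure from the original post with that substitution applied (table and measure names unchanged):

_#category_sessions =
VAR _categories =
    SUMX ( DISTINCT ( myTable_m_expand[EventID] ), 1 )    // replaces DISTINCTCOUNT
RETURN
    SUMX ( DISTINCT ( myTable_m_expand[Time] ), _categories )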

 

If you have experience with DAX Studio, it is very useful for checking measure performance within your model.
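
For example, a query along these lines, run in DAX Studio with Server Timings enabled, could be used to compare the two variants; the table, column, and measure names are taken from the posts above:

EVALUATE
SUMMARIZECOLUMNS (
    myTable_m_expand[ApplicationId],
    "MaxConcurrent", [_#maxConcurrent]
)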

 

Hope this helps.

 

 

 

So I tried this with my actual data model and it crashed due to high RAM consumption, whereas DISTINCTCOUNT is able to run. 

Thank you, I'll give this a try and see what it does to performance. 
