Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Cumulative on a Very Large Dataset with Slicers

Hi there,

 

So, I have been stuck in a cumulative calculation due to the size of my dataset. Here is a sample of the dataset.

 

ID

Users

ActionType

InteractionType

App

7981f340-5e00-49ec-ab55-1cc975cbc5ce

         3,946,783,049

Browse

Right Click

A

b29125a7-a0bc-489c-a739-52c4c3ba1fcf

         1,092,345,660

Browse

Hover

D

be89f247-60f3-455e-b0a7-cd6128bc3f9c

         1,734,750,135

Transaction

Left Click

A

b1c9c5a3-e75c-431e-8b85-bc2d74ba730d

         1,562,373,384

Transaction

Right Click

C

3f9f1dd8-0fda-4c48-94d2-bf779ff95226

         1,107,690,317

Browse

Right Click

B

d0a66a39-3d57-42c5-b2b7-45a54da598a1

             648,011,451

Browse

Right Click

C

c7d7572b-dbb2-4074-985f-4cc44b0a1b51

             230,762,581

Transaction

Left Click

B

a929cead-16ba-49e2-b389-6728af4b5f3d

             228,863,834

Transaction

Left Click

B

08f66143-d1d8-4298-9fbf-ce0eee0e8a62

             216,938,967

Browse

Right Click

D

ef615efa-5892-4b85-90a8-c6718d23e05c

             206,676,175

Transaction

Hover

A

 

My goal is to add Users column cumulatively in descending order (as shown below), while having columns ActionType, InteractionType and App as slicers with multi-select i.e. this calculation becomes dynamic per each slicer combinations.

 

ID

Users

CumulativeUsers

7981f340-5e00-49ec-ab55-1cc975cbc5ce

   3,946,783,049

           3,946,783,049

be89f247-60f3-455e-b0a7-cd6128bc3f9c

   1,734,750,135

           5,681,533,184

b1c9c5a3-e75c-431e-8b85-bc2d74ba730d

   1,562,373,384

           7,243,906,568

3f9f1dd8-0fda-4c48-94d2-bf779ff95226

   1,107,690,317

           8,351,596,885

b29125a7-a0bc-489c-a739-52c4c3ba1fcf

   1,092,345,660

           9,443,942,545

d0a66a39-3d57-42c5-b2b7-45a54da598a1

       648,011,451

         10,091,953,996

c7d7572b-dbb2-4074-985f-4cc44b0a1b51

       230,762,581

         10,322,716,577

a929cead-16ba-49e2-b389-6728af4b5f3d

       228,863,834

         10,551,580,411

08f66143-d1d8-4298-9fbf-ce0eee0e8a62

       216,938,967

         10,768,519,378

ef615efa-5892-4b85-90a8-c6718d23e05c

       206,676,175

         10,975,195,553

 

I have used this DAX measure to calculate CumulativeUsers column above:

 

CumulativeUsers = 
VAR __minUsers = MIN( SampleDataset[Users] )

VAR __cumulativeUsers = CALCULATE( SUM( SampleDataset[Users] ),
                        FILTER( ALLSELECTED(SampleDataset[Users] ), 
                        SampleDataset[Users] >= __minUsers) )

RETURN __cumulativeUsers

 

I am getting the correct answer with this, but the only issue I face is, since I have more than 1 million distinct IDs in my dataset, as soon as I pick a slicer combination (from columns ActionType, InteractionType and App), the report halts and eventually my system crashes due to RAM overflow. It only works for those slicer combinations which have very little data for that particular combination. I have also tried this in systems with total RAM as large as 64GB and got the same response.

 

So, did I do something incorrectly or inefficiently? Or did I go beyond the supported data limits of Power BI here?

 

Your help or suggestions would really be appreciated. Thanks in advance.

1 ACCEPTED SOLUTION
amitchandak
Super User
Super User

@Anonymous , This formula is good. See if there are any bi-directional join. If you can make them single directional. That would help

View solution in original post

2 REPLIES 2
amitchandak
Super User
Super User

@Anonymous , This formula is good. See if there are any bi-directional join. If you can make them single directional. That would help

Anonymous
Not applicable

Hi @amitchandak ,

 

Thank you for responding promptly. I checked all the joins used in our datasource(Power BI Model), as per your suggestion. They were all single directional only to my dismay.

I have tried to limit the data as much as I could over at the SQL view side (while maintaining the dynamicity needed for the current requirement), but its still not good enough to avoid the system crash when using various slicer combinations. 

At this point we are thinking of a parameterized stored procedure approach, that takes care of the cumulative logic based on slicers combinations being sent as parameters. The only issue I could foresee there is the added report response delay and potential loss of multi-select slicer functionality. I read somewhere in these forums that the feature to send mutliple selections per parameter is yet to be added in Power BI.


Please correct me anywhere here or if we can take up a better approach to mitigate this issue.

 

Thank you.  

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.