DouweMeer
Post Prodigy

Optimize Service memory usage

So I have a report with an incremental refresh on a Salesforce source. This Desktop report is published to the Service and, fully loaded, is approximately 9 GB there. The report is (currently) refreshed every 4 hours. I'm working on another report to put on there as well, in excess of 7 GB, with a planned 4-hour refresh and a SQL database as the source.

 

Now, the discussion I'm having with our Power BI Governance team is that, at corporate level, we are hitting the 50 GB memory capacity of our node. We're discussing slimming down reports, but we struggle to determine the memory usage in the first place. My argument is that if I changed my report from Import to DirectQuery, it would in fact use much more memory, as the table would be stored in cache (primary storage, i.e. memory), while with Import it would be stored in storage (i.e. secondary storage). In other words, the current limitation is on memory and not on (secondary) storage, so publishing the report with Import is better than using DirectQuery.

 

Everyone, including myself, is making loads of assumptions about memory usage and how storage works on the Service. Also, the monitoring dashboard created by Microsoft that our admins use is limited in that memory usage can't be filtered down to the report/workspace.

 

One side says DirectQuery would be the memory saver, while the other (myself) says Import is the memory saver. Which is true?


8 REPLIES
JirkaZ
Solution Specialist

@DouweMeer So when you are using Import mode for your tables, the dataset is loaded into the capacity's memory whenever the report is used. When a refresh happens, roughly twice the dataset size is reserved in memory.

On the other hand, when using DirectQuery, only the materialized query results are cached in memory.
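As a rough back-of-the-envelope based on the numbers in this thread (an estimate, not measured values):

9 GB model: ~9 GB resident + ~9 GB refresh copy ≈ 18 GB at peak
7 GB model: ~7 GB resident + ~7 GB refresh copy ≈ 14 GB at peak

If the two 4-hour refresh schedules ever overlap, that alone approaches ~32 GB of the 50 GB node, before any other datasets or query materializations are counted.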

@JirkaZ 

We did some tests by uploading a second, non-'used' version of the same report. During report usage and refresh we didn't see the behaviour you're suggesting. Is there any documentation underpinning what you're referring to?

A note on DirectQuery: it would put stress on the source server's capacity by sending queries each time the report is used. We don't know how the two would size up when compared to each other.

@JirkaZ 

Thank you for your links. However, reading through them, I don't see anything stating that when you import data, "whenever the report is used, the dataset is loaded into the capacity's memory".

They only mention that when you publish a report, the compression ratios vary once it is available in the Service, and that during the publish/refresh stage the VertiPaq storage engine deals with increased memory usage, but nothing is said about what happens when an end user requests a dashboard built on the data model.

JirkaZ
Solution Specialist

@DouweMeer It's by definition. Power BI is SSAS Tabular, an in-memory engine. Whenever you retrieve data, it comes from memory. So when you want to view a report, the model is loaded from storage into memory, and only then can you view it.

@JirkaZ 

That sounds worrying. So if I have the statement:

CALCULATE ( COUNTROWS ( SELECTCOLUMNS ( A, "test", A[A] ) ), USERELATIONSHIP ( A[B], B[A] ) )

You're suggesting that the whole table resulting from the inner join of A and B is held in RAM for as long as the user is accessing the dashboard?

JirkaZ
Solution Specialist

@DouweMeer Exactly. And by the way, the book "The Definitive Guide to DAX" explains the engine behind Power BI really well.

@JirkaZ 

I thought the process would be that A and B are pulled into memory to evaluate the SELECTCOLUMNS statement, and that once that is done, the SELECTCOLUMNS result would be kept in cache for further calculation instead of the whole of A and B.

What I noticed in the past is that when you do not use SELECTCOLUMNS as an intermediate, the calculation takes an enormous amount of time if the table you're calculating on has a relationship with another large table. But maybe that's because I didn't use CALCULATE around SELECTCOLUMNS... (see the sketch below).
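To make concrete what I mean by using SELECTCOLUMNS as an intermediate, here's a rough sketch over the same hypothetical tables A and B (an illustration, not my actual model):

-- without the intermediate: count directly over A with the inactive relationship activated
CALCULATE ( COUNTROWS ( A ), USERELATIONSHIP ( A[B], B[A] ) )

-- with the intermediate: narrow A down to the single column first, then count that
CALCULATE ( COUNTROWS ( SELECTCOLUMNS ( A, "test", A[A] ) ), USERELATIONSHIP ( A[B], B[A] ) )

My assumption was that the second version only needs the one projected column in memory rather than the full width of A.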

Sounds like I have to dive deep into the VertiPaq storage engine :).
