Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
Eliza1994
Regular Visitor

Questions about Storage Capability on Automatic Aggregations for DirectQuery

Hi Community,

 

I am creating reports over a large dataset and access data with DirectQuery. There is a latency when refresh reports and my client is considering whether to use automatic aggregation to optimize report performance.
Now there are a few questions about the capacity of the PowerBI service storage when using automatic aggregation.

 

Licence: Power BI Premium


(1) Will the use of automatic aggregation cause the cache to increase continually and cause insufficient memory ?
(2) If the number of reports is large and the usage time is long enough, will the cache affect the performance of the reports?

 

Any advice/information would be helpful.

2 REPLIES 2
Eliza1994
Regular Visitor

Hi, Charlotte @v-zhangti 

 

Thanks for your answer and I very appreciate for the detailed information you provided.

I still have some confusion about how the auto aggregation works. Would you like to help me to understand more about how the cache will change when I enable auto aggregation?

 

Question 1:  Where the cache will be stored?

In my understanding, the cache is supposed to store in the workspace storage, which has a limit of 100TB per Premium capacity. Each cache will be created by each dataset, if the auto aggregation enabled.

 

Question2:  About the query log

I noticed that for each dataset, Power BI maintains seven days of query log data, and the training operations is based on the 7-day of query log. How will the cache change if I haven’t use one report more than 1 month?  There is no log for the report for recent 7 days. Will the cache be deleted by then?

 

Question3: About the cache growth

Thanks for you mentioned that “Using automatic aggregation also does not cause the cache to keep growing”. It is the very thing that I care about. Would you mind explain some details to help me understand?

I am curiosity of the cache growth because my client wonder if the cache keeps growing so that one day the cache will be full, as both data size in backend data source and the number of reports keeps growing.

In this occasion, what will happen to reports in workspace? Should I purchase more storage space for cache by then?

 

Looking forward to your reply,

Bestwishes,

Eliza

v-zhangti
Community Support
Community Support

Hi, @Eliza1994 

 

Power BI Premium also supports Query caching in Power BI Premium/Embedded to maintain query results. Query caching is a different feature from automatic aggregations. With query caching, Power BI Premium uses its local caching service to implement caching, whereas automatic aggregations are implemented at the dataset level. With query caching, the service only caches queries for the initial report page load, therefore query performance isn't improved when users interact with a report. In contrast, automatic aggregations optimize most report queries by pre-caching aggregated query results, including those queries generated when users interact with reports. Query caching and automatic aggregations can both be enabled for a dataset, but it's likely not necessary. Using automatic aggregation also does not cause the cache to keep growing.

 

When enabled for a DirectQuery dataset, automatic aggregations can boost report query performance by avoiding data source query round trips. Pre-aggregated query results are automatically returned by an in-memory aggregations cache rather than being sent to and returned by the data source. The amount of pre-aggregated data in the in-memory aggregations cache is a small fraction of the amount of data kept in fact and detail tables at the data source. The result is not only better report query performance, but also reduced load on backend data source systems. With automatic aggregations, only a small portion of report and ad-hoc queries that require aggregations not included in the in-memory cache are passed to the backend data source, just like with pure DirectQuery mode.

 

Training and refresh operations are process and resource intensive for both the Power BI service and the data source systems. Increasing the percentage of queries that use aggregations means more aggregations must be queried and calculated from data sources during training and refresh operations, increasing the probability of excessive use of system resources and potentially causing timeouts. To learn more, see Fine tuning.

 

For more information, please refer to the following links.

https://docs.microsoft.com/en-us/power-bi/enterprise/aggregations-auto 

https://docs.microsoft.com/en-us/power-bi/enterprise/aggregations-auto-configure#fine-tuning 

 

Best Regards,

Community Support Team _Charlotte

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel1

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.

Top Solution Authors