RoastedPork
Frequent Visitor

Optimizing the datasource CPU usage

Hi all,

I am looking for workarounds to optimize the CPU usage on the data source (SQL Server). Currently we have a lot of queries in the dataflow, and they all have incremental refresh set up.

The dataflow refresh history shows many queries being sent to the data source in a short amount of time on each refresh. I suspect those concurrent queries are one of the reasons for the high CPU usage on our data source; we have been charged for excessive CPU usage on the SQL Server.

Apart from optimizing query complexity, is there any advice on reducing the CPU usage on the data source? Thanks a lot!
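
One way to check whether the dataflow refresh queries really are what is driving the CPU charge is to look at SQL Server's own query statistics. The sketch below uses the standard sys.dm_exec_query_stats and sys.dm_exec_sql_text DMVs; the TOP count and the millisecond conversion are illustrative choices, not anything taken from this thread.

-- Diagnostic sketch: top CPU-consuming statements since the last restart.
-- total_worker_time is reported in microseconds, so divide by 1000 for ms.
SELECT TOP (20)
    qs.total_worker_time / 1000                        AS total_cpu_ms,
    qs.execution_count,
    qs.total_worker_time / qs.execution_count / 1000   AS avg_cpu_ms,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1)   AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;

If the statements generated by the dataflow's incremental refresh dominate this list, that supports the concurrent-query theory above.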


3 REPLIES
v-jayw-msft
Community Support

Hi @RoastedPork ,

 

You can optimize your data model using the following tips:

 

Remove unused tables or columns, where possible.
Avoid distinct counts on fields with high cardinality – that is, millions of distinct values.
Take steps to avoid fields with unnecessary precision and high cardinality. For example, you could split highly unique datetime values into separate columns (month, year, date, and so on), or, where possible, round high-precision fields to lower their cardinality (for example, 13.29889 -> 13.3).
Use integers instead of strings, where possible.
Be wary of DAX functions that need to test every row in a table (for example, RANKX); in the worst case, these functions can exponentially increase run-time and memory requirements given linear increases in table size.
When connecting to data sources via DirectQuery, consider indexing columns that are commonly filtered or sliced against; indexing greatly improves report responsiveness (a minimal T-SQL sketch follows this list).
Enable Row-Level Security (RLS) where applicable.
Use Microsoft AppSource certified custom visuals where applicable.
Do not use hierarchical filters.
Provide data categorization for Power BI reports (HBI, MBI, LBI).
Use the On-premises data gateway instead of Personal Gateway.
Use slicers sparingly.
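
For the indexing and precision tips above, a minimal T-SQL sketch might look like the following. The table, view, column, and index names (dbo.FactSales, OrderDateTime, SalesAmount, and so on) are invented for illustration and are not objects from the original post.

-- Index a column that report visuals commonly filter or slice against,
-- so DirectQuery and incremental-refresh filters can seek instead of scan.
CREATE NONCLUSTERED INDEX IX_FactSales_OrderDateTime
    ON dbo.FactSales (OrderDateTime)
    INCLUDE (CustomerKey, SalesAmount);

-- Reduce precision and cardinality at the source rather than in Power BI:
-- expose a view that drops the time portion of the datetime and rounds
-- the high-precision measure (for example, 13.29889 -> 13.3).
CREATE OR ALTER VIEW dbo.vFactSales_ForBI
AS
SELECT
    CustomerKey,
    CAST(OrderDateTime AS date) AS OrderDate,
    ROUND(SalesAmount, 1)       AS SalesAmount
FROM dbo.FactSales;

Pointing the dataflow at a trimmed-down view like this lowers the cardinality Power BI has to load, and the index helps the filtered queries issued during incremental refresh.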

 

To minimize the impact of network latency, strive to keep data sources, gateways, and your Power BI cluster as close as possible. If network latency is an issue, try locating gateways and data sources closer to your Power BI cluster by placing them on virtual machines.

 

To further improve network latency, consider using Azure ExpressRoute, which can create faster, more reliable network connections between your clients and Azure datacenters.

 

For more details, take a look at this official document:

https://docs.microsoft.com/en-us/power-bi/transform-model/dataflows/dataflows-understand-optimize-re... 

 

Best Regards,

Jay

Community Support Team _ Jay
If this post helps, then please consider accepting it as the solution to help the other members find it.

Is there a workaround for DISTINCTCOUNT? I need it for one of my metrics, but it is taking way too much CPU.

Burningsuit
Resident Rockstar

Hi @RoastedPork 

You may find this blog post from Chris Webb useful

Chris Webb's BI Blog: Why You Should Optimise Your Power BI Premium Reports And Refreshes For CPU Ti...

Similarly this is used in a "Guy in a Cube" video

When optimizing Power BI don't forget about CPU - YouTube

Hope this helps

Stuart
