Inefficient SQL Queries from Power BI (DirectQuery...

Kheranooh · ‎06-13-2022

Our Power BI Report has Direct Query connection to an On-Prem SQL Server. Report has a model created between 3 tables (“Account Rep Mapping“, “Orders Fact”, “Sales Rep Dim”).

Please note, in our Prod environment, “Orders” table has 1 billion rows, “Account Rep” has 150 million rows, “Sales Rep” has 35k rows and “Accounts” has 4 million rows.

Model is created between these tables as below, Orders and Acct_Rep has M*M relationship and cross filtering is set to single direction where Acct_Rep filters Orders.

Our concern is about the inefficient SQL queries that are being generated by Power BI (queries collected using profiler and Power BI Performance Analyzer). These SQL queries have bad impact on query performance and taking too long to run. We find that SQL queries has room for optimization (Aggregation can happen on early stage, rather than after all joins which is very expensive SQL operation). We created aggregate table on Orders to gain better performance and made it 100 million rows, but queries are taking too long, checked the sql statements execution plan in management studio. It’s doing aggregation only after all joins.

Happy to provide more details if required.

--SQL Query (Inefficient)

// Direct Query

SELECT TOP (1000001) *

FROM (

SELECT [semijoin1].[c9]

,SUM([a0]) AS [a0]

FROM (

(

SELECT [t0].[ACCT_ID] AS [c2]

,[t0].[REVN] AS [a0]

FROM (

(

SELECT [$Table].[ORD_NBR] AS [ORD_NBR]

,[$Table].[ACCT_ID] AS [ACCT_ID]

,[$Table].[REVN] AS [REVN]

FROM [dbo].[ORDERS] AS [$Table]

)

) AS [t0]

) AS [basetable0] INNER JOIN (

SELECT [t1].[ACCT_ID] AS [c2]

,[t2].[SLSREP_NM] AS [c9]

FROM (

(

SELECT [$Table].[ACCT_ID] AS [ACCT_ID]

,[$Table].[SLSREP_ID] AS [SLSREP_ID]

FROM [dbo].[ACCT_REP] AS [$Table]

) AS [t1] LEFT OUTER JOIN (

SELECT [$Table].[SLSREP_ID] AS [SLSREP_ID]

,[$Table].[SLSREP_NM] AS [SLSREP_NM]

FROM [dbo].[SLSREP] AS [$Table]

) AS [t2] ON ([t1].[SLSREP_ID] = [t2].[SLSREP_ID])

)

GROUP BY [t1].[ACCT_ID]

,[t2].[SLSREP_NM]

) AS [semijoin1] ON (([semijoin1].[c2] = [basetable0].[c2]))

)

GROUP BY [semijoin1].[c9]

) AS [MainTable]

WHERE (NOT (([a0] IS NULL)))

Kheranooh · ‎06-19-2022

We want Power BI Product/Development team's attension on this. It would bring significant improvement on performance if they could optimize the engine or code whatever is necessary to generate the optimal query instead of inefficient SQL Queries from Power BI (DirectQuery) for M*M relationships.
Or if there is something that we could do at our end to gain the same, would be eager to know.

v-jayw-msft · ‎06-17-2022

Hi @Kheranooh ,

What is your requirement?

Best Regards,

Jay

Community Support Team _ Jay
If this post helps, then please consider Accept it as the solution to help the other members find it.

Inefficient SQL Queries from Power BI (DirectQuery) for M*M relationships

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024