Our Power BI report uses a DirectQuery connection to an on-premises SQL Server. The model is built on 3 tables (“Account Rep Mapping”, “Orders Fact”, “Sales Rep Dim”).
Please note that in our Prod environment the “Orders” table has 1 billion rows, “Account Rep” has 150 million rows, “Sales Rep” has 35k rows and “Accounts” has 4 million rows.
In the model, Orders and Acct_Rep have a many-to-many (M*M) relationship, and cross-filtering is set to single direction so that Acct_Rep filters Orders.
Our concern is the inefficient SQL that Power BI generates (queries captured with Profiler and Power BI Performance Analyzer). These queries take too long to run. We see room for optimization: the aggregation could happen at an early stage rather than after all the joins, which is a very expensive SQL operation. We created an aggregate table on Orders, which brought it down to 100 million rows, but queries still take too long. We checked the execution plan of the generated statements in Management Studio, and the aggregation still happens only after all the joins.
Happy to provide more details if required.
--SQL Query (Inefficient)
// Direct Query
SELECT TOP (1000001) *
FROM (
    SELECT [semijoin1].[c9]
        ,SUM([a0]) AS [a0]
    FROM (
        (
            SELECT [t0].[ACCT_ID] AS [c2]
                ,[t0].[REVN] AS [a0]
            FROM (
                (
                    SELECT [$Table].[ORD_NBR] AS [ORD_NBR]
                        ,[$Table].[ACCT_ID] AS [ACCT_ID]
                        ,[$Table].[REVN] AS [REVN]
                    FROM [dbo].[ORDERS] AS [$Table]
                )
            ) AS [t0]
        ) AS [basetable0] INNER JOIN (
            SELECT [t1].[ACCT_ID] AS [c2]
                ,[t2].[SLSREP_NM] AS [c9]
            FROM (
                (
                    SELECT [$Table].[ACCT_ID] AS [ACCT_ID]
                        ,[$Table].[SLSREP_ID] AS [SLSREP_ID]
                    FROM [dbo].[ACCT_REP] AS [$Table]
                ) AS [t1] LEFT OUTER JOIN (
                    SELECT [$Table].[SLSREP_ID] AS [SLSREP_ID]
                        ,[$Table].[SLSREP_NM] AS [SLSREP_NM]
                    FROM [dbo].[SLSREP] AS [$Table]
                ) AS [t2] ON ([t1].[SLSREP_ID] = [t2].[SLSREP_ID])
            )
            GROUP BY [t1].[ACCT_ID]
                ,[t2].[SLSREP_NM]
        ) AS [semijoin1] ON (([semijoin1].[c2] = [basetable0].[c2]))
    )
    GROUP BY [semijoin1].[c9]
) AS [MainTable]
WHERE (NOT (([a0] IS NULL)))
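For comparison, here is a hand-written sketch of the query shape we have in mind. It is not something Power BI produced. It sums REVN per ACCT_ID first and only then joins to the distinct account/rep pairs, so the join works on one row per account instead of one row per order. The aliases [ord], [reps], [ar] and [sr] are ours, the TOP (1000001) wrapper is left out, and it assumes a plain SUM is the only measure involved. Whether the optimizer actually runs this faster depends on indexes and statistics, so treat it as an illustration of the idea rather than a tested fix.

```sql
-- Hand-written sketch (not generated by Power BI): aggregate ORDERS early.
SELECT [reps].[SLSREP_NM]
    ,SUM([ord].[REVN]) AS [REVN]
FROM (
    -- Collapse the fact table to one row per account before any join.
    SELECT [ACCT_ID]
        ,SUM([REVN]) AS [REVN]
    FROM [dbo].[ORDERS]
    GROUP BY [ACCT_ID]
) AS [ord]
INNER JOIN (
    -- Distinct account / rep-name pairs, same grouping as [semijoin1] above.
    SELECT [ar].[ACCT_ID]
        ,[sr].[SLSREP_NM]
    FROM [dbo].[ACCT_REP] AS [ar]
    LEFT OUTER JOIN [dbo].[SLSREP] AS [sr]
        ON [ar].[SLSREP_ID] = [sr].[SLSREP_ID]
    GROUP BY [ar].[ACCT_ID]
        ,[sr].[SLSREP_NM]
) AS [reps]
    ON [reps].[ACCT_ID] = [ord].[ACCT_ID]
GROUP BY [reps].[SLSREP_NM]
-- Same effect as the outer WHERE [a0] IS NOT NULL in the generated query.
HAVING SUM([ord].[REVN]) IS NOT NULL;
```

Because SUM ignores NULLs and can be computed in stages, summing the per-account totals should return the same revenue per sales rep as summing the individual order rows after the join.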
We would like the Power BI product/development team's attention on this. It would improve performance significantly if the engine could generate an optimal query, instead of the inefficient SQL shown above, for DirectQuery models with M*M relationships.
Or, if there is something we could do on our end to achieve the same result, we would be eager to know.
Hi @Kheranooh ,
What is your requirement?
Best Regards,
Jay