Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
SamTrexler
Helper IV
Helper IV

Executed query is not folded like View Native Query shows, preventing successful refresh

I am moving our reports from shared capacity to an embedded node, and I'm having trouble exceeding the memory of the node when performing a refresh of the imported data. The data being imported is much less than what this report does successfully in the shared workspace.

 

Upon investigation, it appears from the diagnostics in the gateway log that the query is not being folded at all - so the full table is being retrieved, and then the filter on date must be handled in the mashup engine. And this happens identically in the shared workspace and the dedicated capacity workspace.

 

Here's what View Native Query shows for the last step where it is available:

 

select [$Outer].[RPT_Tests_Summary_PKColumn2] as [RPT_Tests_Summary_PKColumn],
    [$Outer].[KeyDateTime2] as [KeyDateTime],
..... 
..... (three nested queries)
.....
                [_].[eSigFlagText]
            from [dbo].[RPT_Tests_Summary] as [_]
            where [_].[KeyDateTime] > convert(datetime2, '2016-01-01 00:00:00')
        ) as [_]
    ) as [$Outer]
    left outer join [dbo].[RPT_TestsOBD] as [$Inner] on ([$Outer].[RPT_Tests_Summary_PKColumn2] = [$Inner].[RPT_Tests_Summary_PKColumn])
) as [$Outer]
left outer join [dbo].[RPT_Stations] as [$Inner] on ([$Outer].[StationId2] = [$Inner].[StationId])

Only two steps remain in the query after this: a sort on the same column being filtered, and a Rename Columns step. Neither of those should prevent query folding. And even if they do, it is my understanding that the folded query to this point will be sent to SQL Server, and only the remaining two step performed in the mashup engine.

 

But that's not what is happening. Here is what the Gateway log shows with EmitQueryTrace set to True and advanced logging enabled:

 

DM.EnterpriseGateway Information: 0 : 2018-07-20T22:00:09.7778877Z DM.EnterpriseGateway	ebdabc5c-724c-4430-b17a-01bcfe02b0ad	80705571-8f46-4cd8-8e05-fd246b0578dc	MGEQ	80705571-8f46-4cd8-8e05-fd246b0578dc	4A38E118 [DM.GatewayCore] Executing query (timeout=18000) "<pi>SELECT * FROM [RPT_Tests_Summary]</pi>", requestId=null

As you can see, only the driving table is being retrieved, and the date filter is not being applied. (The number and size of the packets retrieved in the async stream lends credence to the idea that it is retrieving all rows.) This happens in the shared workspace as well as the dedicated capacity workspace.

 

This report retrieves data all the way back to 2014 in the shared workspace, but can't retrieve half of that in the A2 node. I thought I could control how much memory was needed (what size node we should choose) by controlling the date range retrieved, but it's not working because the date filter is not being sent to SQL Server.

 

I am using the June 2018 version of Desktop as well as the June 2018 version of the gateway.

 

Can anyone help me figure this out?

 

Thanks in advance for your help.

 

Sam

8 REPLIES 8
SamTrexler
Helper IV
Helper IV

An update: Microsoft is investigating why refreshing this report succeeds in shared capacity but fails (or takes a lot of memory and time) in embedded capacity (an A1, A2 or A3 node).

 

I've discusses the apparent lack of query folding with them, but that's not the focus of their investigation at this time.

 

Can anyone give me an idea why View Native Query shows the folded query but the gateway log does not?

 

I've done some more testing, recreating the report from scratch again, just retriving the driving table and filtering on date - not expanding other tables within the query or anything else. No visuals, just retrieving data, as simple as you can get. View Native Query shows the date comparison and the list of columns to be retrieved, but the gateway logs do not.

 

So it doesn't have anything to do with complexity of the steps or any post-retrieval steps. Any other ideas?

 

Thanks again for any help.

 

Sam

HI @SamTrexler,

 

>>Can anyone give me an idea why View Native Query shows the folded query but the gateway log does not?

If you are working on sql server datasource, I think profiler should suitable to trace queries.

SQL Server Profiler

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Hi @v-shex-msft, Profiler is a great idea - I should have confirmed the behavior there.

 

But the gateway log shows something like

SELECT * FROM tab1

 

instead of the Native Query displayed in the query editor, which would be something like

SELECT tab1.colA, tab2.colB, tab3.colC
FROM tab1
INNER JOIN tab2 ON tab2.ForeignKey = tab1.PrimaryKey
INNER JOIN tab3 ON tab3.ForeignKey = tab2.PrimaryKey
where tab1.DateCol > '2017-01-01'

So I'm supposing Profiler is just going to show me the same query that the gateway log shows. Which means query folding is not taking place. If that's what's happening, there are probably two more queries being sent across the gateway, and the join is being done in the mashup engine - very expensive on performance.

 

I definitely want the database engine to restrict the retrieved rows on the date column and join the two other tables, without having to write the native query myself - Power BI does that automatically, and maintains it as I maintain the query. Except it looks like it's not happening.

 

I will find some time to try this out with Profiler to confirm that the gateway log is not misleading me.

 

Thanks,

 

Sam

Anonymous
Not applicable

Hi Sam,

 

Did you get a resolution to this?

We're facing a similar issue. We have a report that used to fold the query 'fully' for a report joining 1 fact table with 3 dimensions.

(e.g. select from fact join dim1 join dim2 join dim3 where date='jan 1st' and dim1.label = 'xyz')

 

Later we added (unrelated) dimensions to the model, but the same report now does not fully fold the query. Instead, it ran 3 different queries

(e.g. select from fact join dim1 where dim1.label; select from dim2 join dim3; select from dim3)
Looks like this construct returns millions of rows to -later- be joined/filtered in the mashup engine.

 

I checked if there are differences between the older and newer versions of the report, but found nothing except the added (unrelated) dimension tables.

Hi @Anonymous 

 

No, there was never really an answer other than trying to wotk through the steps of the query to see what I could do to optimize what was sent to SQL Server. 

 

Since that time, Microsoft has introduced the Performance Analyzer on the Modeling tab of Desktop. That's very helpful. It shows how much time is spent in the query sent to the database, as opposed to time spent in DAX and preparing the visual, and it gives you the query that was actually sent. It also shows you the queries for each visual - if you have several visuals on a page, that may need some optimization, too.

 

Two tips I've found useful:

 

  • When you expand a related table (e.g., a dimension), Query Editor doesn't give you the opportunity to name the resulting columns - other than whether or not to prefix it with the query name. If you want some other name, you can add a step to rename the column after you expand it. But that can create a very inefficient nested query to SQL Server. Instead, I go into Advanced Editor and rename the column in the output, which creates a much better SQL query, e.g.,
#"Expanded DimStations.DimRegions" = Table.ExpandRecordColumn(#"Expanded DimStations", "DimRegions", {"RegionName"}, {"Region Name"}),
  •  Be careful how many filters you create in the query itself (i.e., clicking on a column in Query Editor and filtering it there). Query Editor doesn't always handle that as cleanly as you might like - I had filters applied at several levels on deeply nested queries. Simplifying that got me down to one "where" clause in SQL at the outer level of the query - which SQL Server handles nicely.

Other than that, review the order of steps and how you do them, and see what the query looks like. I've been able to improve the queries sent to SQL Server quite a bit that way.

 

Hope this helps.

 

Regards,

 

v-shex-msft
Community Support
Community Support

Hi @SamTrexler,

 

According to your description, it seems like your query is to complex and reference too many table so gateway Executing query timeout.

 

If your issue is caused with node memory exceeding to limit, I haven't found any effective solutions except upgrade license to increase capacity. 

 

BTW, you can also submit a support ticket to get further support.

submit a support ticketsubmit a support ticket

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Hi @v-shex-msft, thanks for the reply.

 

The gateway has no problem executing this query when it is run from a report in Shared Capacity.  But it fails when the report is run in Dedicated Capacity - an A2 node. So I don't think the problem is a gateway problem.

 

An A2 node is supposed to allow 5 GB of cache - more than allowed in Shared Capacity. Yet I am hitting some limit. And Azure shows the memory usage never exceeds 2.7 GB - the same amount it used when it was an A1 node (3 GB cache). I'm just learning how to be a capacity admin, so I may have missed something. Can someone help?

 

And the reason I am asking about query folding is because I came across it while investigating my basic problem, and it doesn't seem right. It will certainly increase the memory usage to retrieve all the rows, so that may very well alleviate the problem. However, since that "unfolded" query works in Shared Capacity with its 1GB limit, it may not help at all.

 

I'm following up on every possible cause or explanation I can think of or dig up in research as I try to move us from Shared Capacity to Dedicated Capacity. Any help (other than pointing me to the documentation, which I've already worked through - although I suppose it's possible I may have missed or mis-interpreted some step) would be greatly appreciated.

 

Today I am going to re-create the report from scratch to see if it's some bug in the report that causes either or both of the symptoms - failing in Dedicated Capacity, and not folding the query. If so, I've got a big but known quantity of work to rebuild all of my reports that link to this one, and any other reports that fail and all of the reports that link to them. I will file a support ticket if that doesn't work and I don't get any tips that help.

 

Surely changing a workspace to use Dedicated (embedded) Capacity - or loading an existing PBIX file into a new workspace assigned to Dedicated Capacity, I've tried both - shouldn't be this tough. What am I missing?

 

Thanks for your help,

 

Sam

 

Recreating the report made no difference. I created it from scratch, with no other PBI reports in memory or accessed by Desktop since the last reboot - so there couldn't be any artifacts left in the in-memory SSAS. I created just the query and a very simple tablular report - no slicers, no measures, no sort, no other visuals, just the bare bones of the report. On the last step "View Native Query" is enabled, and it shows the date restriction and join being passed in the native query to SQL Server.

 

Results:

  1. The new report will not publish successfully to the dedicated capactiry (A1 node). It fails with "An error occurred while attempting to publish 'Data Lookup - Test Summary - Rebuild.pbix': The remote server returned an error: (404) Not Found."
  2. In the workspace in shared capacity, it publishes just fine, and runs successfully. However, it is stil performing a "SELECT * FROM RPT_Tests_Summary", according to the gateway diagnostic logs.

I am going to open a support ticket with Microsoft. However, if anyone has tips for a newbie capacity admin, it would be gratefully appreciated.

 

Thanks,

 

Sam

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors