Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I am moving our reports from shared capacity to an embedded node, and I'm having trouble exceeding the memory of the node when performing a refresh of the imported data. The data being imported is much less than what this report does successfully in the shared workspace.
Upon investigation, it appears from the diagnostics in the gateway log that the query is not being folded at all - so the full table is being retrieved, and then the filter on date must be handled in the mashup engine. And this happens identically in the shared workspace and the dedicated capacity workspace.
Here's what View Native Query shows for the last step where it is available:
select [$Outer].[RPT_Tests_Summary_PKColumn2] as [RPT_Tests_Summary_PKColumn], [$Outer].[KeyDateTime2] as [KeyDateTime], ..... ..... (three nested queries) ..... [_].[eSigFlagText] from [dbo].[RPT_Tests_Summary] as [_] where [_].[KeyDateTime] > convert(datetime2, '2016-01-01 00:00:00') ) as [_] ) as [$Outer] left outer join [dbo].[RPT_TestsOBD] as [$Inner] on ([$Outer].[RPT_Tests_Summary_PKColumn2] = [$Inner].[RPT_Tests_Summary_PKColumn]) ) as [$Outer] left outer join [dbo].[RPT_Stations] as [$Inner] on ([$Outer].[StationId2] = [$Inner].[StationId])
Only two steps remain in the query after this: a sort on the same column being filtered, and a Rename Columns step. Neither of those should prevent query folding. And even if they do, it is my understanding that the folded query to this point will be sent to SQL Server, and only the remaining two step performed in the mashup engine.
But that's not what is happening. Here is what the Gateway log shows with EmitQueryTrace set to True and advanced logging enabled:
DM.EnterpriseGateway Information: 0 : 2018-07-20T22:00:09.7778877Z DM.EnterpriseGateway ebdabc5c-724c-4430-b17a-01bcfe02b0ad 80705571-8f46-4cd8-8e05-fd246b0578dc MGEQ 80705571-8f46-4cd8-8e05-fd246b0578dc 4A38E118 [DM.GatewayCore] Executing query (timeout=18000) "<pi>SELECT * FROM [RPT_Tests_Summary]</pi>", requestId=null
As you can see, only the driving table is being retrieved, and the date filter is not being applied. (The number and size of the packets retrieved in the async stream lends credence to the idea that it is retrieving all rows.) This happens in the shared workspace as well as the dedicated capacity workspace.
This report retrieves data all the way back to 2014 in the shared workspace, but can't retrieve half of that in the A2 node. I thought I could control how much memory was needed (what size node we should choose) by controlling the date range retrieved, but it's not working because the date filter is not being sent to SQL Server.
I am using the June 2018 version of Desktop as well as the June 2018 version of the gateway.
Can anyone help me figure this out?
Thanks in advance for your help.
Sam
An update: Microsoft is investigating why refreshing this report succeeds in shared capacity but fails (or takes a lot of memory and time) in embedded capacity (an A1, A2 or A3 node).
I've discusses the apparent lack of query folding with them, but that's not the focus of their investigation at this time.
Can anyone give me an idea why View Native Query shows the folded query but the gateway log does not?
I've done some more testing, recreating the report from scratch again, just retriving the driving table and filtering on date - not expanding other tables within the query or anything else. No visuals, just retrieving data, as simple as you can get. View Native Query shows the date comparison and the list of columns to be retrieved, but the gateway logs do not.
So it doesn't have anything to do with complexity of the steps or any post-retrieval steps. Any other ideas?
Thanks again for any help.
Sam
HI @SamTrexler,
>>Can anyone give me an idea why View Native Query shows the folded query but the gateway log does not?
If you are working on sql server datasource, I think profiler should suitable to trace queries.
Regards,
Xiaoxin Sheng
Hi @v-shex-msft, Profiler is a great idea - I should have confirmed the behavior there.
But the gateway log shows something like
SELECT * FROM tab1
instead of the Native Query displayed in the query editor, which would be something like
SELECT tab1.colA, tab2.colB, tab3.colC FROM tab1 INNER JOIN tab2 ON tab2.ForeignKey = tab1.PrimaryKey INNER JOIN tab3 ON tab3.ForeignKey = tab2.PrimaryKey
where tab1.DateCol > '2017-01-01'
So I'm supposing Profiler is just going to show me the same query that the gateway log shows. Which means query folding is not taking place. If that's what's happening, there are probably two more queries being sent across the gateway, and the join is being done in the mashup engine - very expensive on performance.
I definitely want the database engine to restrict the retrieved rows on the date column and join the two other tables, without having to write the native query myself - Power BI does that automatically, and maintains it as I maintain the query. Except it looks like it's not happening.
I will find some time to try this out with Profiler to confirm that the gateway log is not misleading me.
Thanks,
Sam
Hi Sam,
Did you get a resolution to this?
We're facing a similar issue. We have a report that used to fold the query 'fully' for a report joining 1 fact table with 3 dimensions.
(e.g. select from fact join dim1 join dim2 join dim3 where date='jan 1st' and dim1.label = 'xyz')
Later we added (unrelated) dimensions to the model, but the same report now does not fully fold the query. Instead, it ran 3 different queries
(e.g. select from fact join dim1 where dim1.label; select from dim2 join dim3; select from dim3)
Looks like this construct returns millions of rows to -later- be joined/filtered in the mashup engine.
I checked if there are differences between the older and newer versions of the report, but found nothing except the added (unrelated) dimension tables.
Hi @Anonymous
No, there was never really an answer other than trying to wotk through the steps of the query to see what I could do to optimize what was sent to SQL Server.
Since that time, Microsoft has introduced the Performance Analyzer on the Modeling tab of Desktop. That's very helpful. It shows how much time is spent in the query sent to the database, as opposed to time spent in DAX and preparing the visual, and it gives you the query that was actually sent. It also shows you the queries for each visual - if you have several visuals on a page, that may need some optimization, too.
Two tips I've found useful:
#"Expanded DimStations.DimRegions" = Table.ExpandRecordColumn(#"Expanded DimStations", "DimRegions", {"RegionName"}, {"Region Name"}),
Other than that, review the order of steps and how you do them, and see what the query looks like. I've been able to improve the queries sent to SQL Server quite a bit that way.
Hope this helps.
Regards,
Hi @SamTrexler,
According to your description, it seems like your query is to complex and reference too many table so gateway Executing query timeout.
If your issue is caused with node memory exceeding to limit, I haven't found any effective solutions except upgrade license to increase capacity.
BTW, you can also submit a support ticket to get further support.
Regards,
Xiaoxin Sheng
Hi @v-shex-msft, thanks for the reply.
The gateway has no problem executing this query when it is run from a report in Shared Capacity. But it fails when the report is run in Dedicated Capacity - an A2 node. So I don't think the problem is a gateway problem.
An A2 node is supposed to allow 5 GB of cache - more than allowed in Shared Capacity. Yet I am hitting some limit. And Azure shows the memory usage never exceeds 2.7 GB - the same amount it used when it was an A1 node (3 GB cache). I'm just learning how to be a capacity admin, so I may have missed something. Can someone help?
And the reason I am asking about query folding is because I came across it while investigating my basic problem, and it doesn't seem right. It will certainly increase the memory usage to retrieve all the rows, so that may very well alleviate the problem. However, since that "unfolded" query works in Shared Capacity with its 1GB limit, it may not help at all.
I'm following up on every possible cause or explanation I can think of or dig up in research as I try to move us from Shared Capacity to Dedicated Capacity. Any help (other than pointing me to the documentation, which I've already worked through - although I suppose it's possible I may have missed or mis-interpreted some step) would be greatly appreciated.
Today I am going to re-create the report from scratch to see if it's some bug in the report that causes either or both of the symptoms - failing in Dedicated Capacity, and not folding the query. If so, I've got a big but known quantity of work to rebuild all of my reports that link to this one, and any other reports that fail and all of the reports that link to them. I will file a support ticket if that doesn't work and I don't get any tips that help.
Surely changing a workspace to use Dedicated (embedded) Capacity - or loading an existing PBIX file into a new workspace assigned to Dedicated Capacity, I've tried both - shouldn't be this tough. What am I missing?
Thanks for your help,
Sam
Recreating the report made no difference. I created it from scratch, with no other PBI reports in memory or accessed by Desktop since the last reboot - so there couldn't be any artifacts left in the in-memory SSAS. I created just the query and a very simple tablular report - no slicers, no measures, no sort, no other visuals, just the bare bones of the report. On the last step "View Native Query" is enabled, and it shows the date restriction and join being passed in the native query to SQL Server.
Results:
I am going to open a support ticket with Microsoft. However, if anyone has tips for a newbie capacity admin, it would be gratefully appreciated.
Thanks,
Sam
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.