Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Remo_
Advocate III
Advocate III

Direct Query with Snowflake sequential handling of queries

Hi folks,

 

For our customer we have build a Snowflake on Azure datawarehouse. Since the customer has a strict data policy we would like to use the Snowflake dynamic data masking based on user account (https://docs.snowflake.com/en/user-guide/security-column-ddm-use.html). Therefore we have build our PBI datasets with Direct Query and SSO applied to ensure that a user will only retrieve the data that he/she is allowed to see. 

 

This all works fine, but performance of the reports is still an issue. We already tried to play with the following settings in our dataset (as described here https://learn.microsoft.com/en-us/power-bi/create-reports/desktop-evaluation-configuration😞

- Increase maximum number of simultanous evaluations (under parallel loading of tables)

- Increase maximum number of concurrent jobs

- Set active number of data source connections to 30

- Put the dataset in an premium workspace to make sure we can use 30 data source connections 

 

However, when testing queries in our report, we still see that a visual that sends out multiple queries is being handles sequentially instead of parallel. In the below example we have a visual that sends out 13 queries (regardless of whether it is a good idea to have this large number of queries sends to the source by 1 visual). 

 

Dax Studio trace on Dataset in PPU workspace

We see that handling of queries is being done sequentially:

Remo__0-1663668418068.png

I would suspect that having all the correct settings, part of these queries would be handled in parallel. Does anyone know why this is not the case?

 

Snowflake

Looking in Snowlfake we see the same pattern (take notice! this is not the exact same run, so the numbers in Snowflake are not 1 on 1 the same as in DAX studio). Queries do not take to long to anser by Snowflake, but since they are all evaluated sequentially, it takes a long time to get an answer for the visual.

Remo__1-1663668535774.png

 

Horizontal fusion

As of yesterday, microsoft announced a new preview feature: Horizontal Fusion. Which should be able to merge queries and sent out fewer queries to the source. (https://powerbi.microsoft.com/en-us/blog/announcing-horizontal-fusion-a-query-performance-optimizati...). Obviously, this would be great as it would not send out 13 queries (in this example), but maybe only 3 or 4. However, this does not change the fact that these queries shall be handled sequentially. Is there anything we can do to make this happen? Or can the formula engine which compiles all the answers from the underlying queries to 1 output only handle one query at the time?

 

All thoughts are greatly appreciated.

 

Regards,

 

Remo

 

 

1 ACCEPTED SOLUTION
oscarrudnas
Regular Visitor

This article by SQLBI actually describes this issue. 

 

The Formula engine does not support parallelism and will send the queries sequentially. I am not sure why this is the case, though but at least I can start to focus on other problems since this is really dax motor-behaviour.

https://www.sqlbi.com/articles/formula-engine-and-storage-engine-in-dax/

 

Because the formula engine is single-threaded, any operation executed in the formula engine uses just one thread and one core, no matter how many cores are available. The formula engine sends requests to the storage engine sequentially, one query at a time. A certain degree of parallelism is available only within each request to the storage engine, which has a different architecture and can take advantage of the multiple cores available.

View solution in original post

5 REPLIES 5
oscarrudnas
Regular Visitor

This article by SQLBI actually describes this issue. 

 

The Formula engine does not support parallelism and will send the queries sequentially. I am not sure why this is the case, though but at least I can start to focus on other problems since this is really dax motor-behaviour.

https://www.sqlbi.com/articles/formula-engine-and-storage-engine-in-dax/

 

Because the formula engine is single-threaded, any operation executed in the formula engine uses just one thread and one core, no matter how many cores are available. The formula engine sends requests to the storage engine sequentially, one query at a time. A certain degree of parallelism is available only within each request to the storage engine, which has a different architecture and can take advantage of the multiple cores available.

oscarrudnas
Regular Visitor

I have exactly the same issue with Power BI Direct Query against Azure Databricks.

I have contacted Microsoft support, but have not yet got any solution to this. Do you have any update to this topic that could help with the issue?

No I haven't received any support from Microsoft other than the comment below.

v-cgao-msft
Community Support
Community Support

Hi @Remo_ ,

 

Regardless of the improvements, the performance of the data source should always be the primary consideration when using DirectQuery. If the data source is slow, using DirectQuery on that data source will still not be feasible.

 

'Horizontal Fusion' is a great feature, according to the blog, you need to enable preview on desktop and republish to service for it to work, did you do that? If it still doesn't work, it is recommended to comment below the blog to ask PM if the feature supports Snowflake data source. Finally, it's a preview feature, and I recommend that users consider using the preview feature carefully in a production environment.

 

Behavior:

vcgaomsft_2-1663744714530.png

Using DirectQuery in Power BI


Premium capacity enables query caching to speed up report opening:
Query caching in Power BI Premium

 

Best Regards,
Gao
Community Support Team

 

If there is any post helps, then please consider Accept it as the solution to help the other members find it more quickly. If I misunderstand your needs or you still have problems on it, please feel free to let us know. Thanks a lot!

How to get your questions answered quickly -- How to provide sample data

Hi Gao @v-cgao-msft ,

Thanks for you reaction, however this is unfortunately not the answer I was looking for.

 

1. Regardless of whether the data source is fast or slow, I would want to know if there is a way to parrellize the handling of queries by the Power BI engine, because according to the microsoft docs that I refer to above, changing the settings as I did should help.

2. I did change the preview setting to have HorizontalFusion, and published it to a premium workspace, but no luck so far.

3. Regarding query caching, this is only valid for import datasets. My question is about DirectQuery mode.

 

I am interested to know if this feature supports Snowflake. How do I notify the PM about this thread?

 

Regards,

 

Remo

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors