Showing results for 
Search instead for 
Did you mean: 
Frequent Visitor

Query folding creates unusable queries in merge

We use Snowflake as a data source, which (unofficially?) supports query folding. This normally works, but when joining large tables via "Merge Queries" we keep running into the issue that the resulting query will either time-out or takes too much time to be usable.


So when looking at the queries generated by Power BI, there are these odd statements attached to the ON clauses which seem to be the cause of the problem:



    ) as "ITBL" on (
          ("OTBL"."D_X_KEY" = "ITBL"."D_X_KEY" and not "OTBL"."D_X_KEY" is null)
          and not "ITBL"."D_X_KEY" is null
          or "OTBL"."D_X_KEY" is null
          and "ITBL"."D_X_KEY" is null
) as "ITBL" on (
      ("OTBL"."D_Y_KEY" = "ITBL"."D_Y_KEY" and not "OTBL"."D_Y_KEY" is null)
      and not "ITBL"."D_Y_KEY" is null
      or "OTBL"."D_Y_KEY" is null
      and "ITBL"."D_Y_KEY" is null



Leaving the "is null" clauses out, it will run fine and within seconds:



    ) as "ITBL" on "OTBL"."D_X_KEY" = "ITBL"."D_X_KEY"
) as "ITBL" on "OTBL"."D_Y_KEY" = "ITBL"."D_Y_KEY"



Has anyone seen this behaviour? Is there anything can be done to prevent this, without resorting to native queries? 

Helper V
Helper V

It is not uncommon for Power BI queries to experience performance issues when joining large tables. The additional clauses added to the ON statements in your query are causing the issue, and removing them has improved performance. These clauses are used to handle null values, but they can be problematic in certain situations.

One solution to improve performance is to use native SQL queries instead of relying on the Power Query Editor. This would give you greater control over the query and allow you to optimize it for performance.

If you prefer to continue using the Power Query Editor, there are some potential workarounds. For example, you could try breaking down the query into smaller, more manageable steps or use other techniques such as filtering or sorting the data to improve query performance. Additionally, you could try modifying the Snowflake settings to optimize query performance, such as increasing the warehouse size or adjusting the concurrency level.

Ultimately, the best approach may depend on the specific needs and constraints of your project. It may be helpful to consult with a Snowflake expert or a Power BI consultant to find the best solution for your particular situation.

Super User
Super User

when joining large tables via "Merge Queries"

Try to avoid using merges/joins in Power Query as much as possible. They are very expensive in a couple of areas.  Instead, let the data model do the work for you  (unless you have to use fuzzy joins in which case you're doomed. I mean in which case you need to think hard about cardinality and order of joins.)

Thanks for your reply. We do try to avoid them as much as possible, however currently there are situations for us where (IMHO) there's no alternative.


Also I assumed that it's actually less expensive (on the Power BI side)  to do it this way, since the amount of data that is imported is sized down beforehand?

Before merging you should reduce the column count to the smallest viable amount. I believe the data has to be sized down. You could try to unpivot other columns that are not the merge column. I have always tried to Remove columns when it's possible (and they are not needed).

Helpful resources

Winner of T-Shirt Design

Power BI T-Shirt Design Challenge 2023

Find out who won the T-Shirt Design Challenge and the top 3 finalists.

March 2023 Update3

Power BI March 2023 Update

Find out more about the March 2023 update.

Top Kudoed Authors