Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Anonymous
Not applicable

Power BI loads all the rows before loading the queried rows only

I have a 5 million rows timeseries database. I want to let users select the date range to load only the data in the selected time period. So I created StartDate parameter, and wrote the following query:

let
    Source = Sql.Database("projectbamosz.database.windows.net", "project_bamosz", [Query="select DATUM, ISIN_KOD, Netto_ESZKOZERTOK FROM dbo.timeseries"]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"DATUM", type date}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each [DATUM] > StartDate)
in
    #"Filtered Rows"

 

I set the StartDate parameter so i get only a few thousand rows of data out of the 5 million. 

However my problem is, that after modifying the parameter and applying changes, and the apply query changes window appear, all the 5 millions rows load two times, and finally at third time only the queried rows are loaded. 

In this way it seems to be unnecessarily slow. Is there a way to make avoid loading all the rows before filtering it?

My goal is to let users get new reports quickly by changing sql queries with the parameters. and not to load all the original database again and again when the parameter is changed. 

 

Thanks a lot, 
Dominigo

2 ACCEPTED SOLUTIONS
lbendlin
Super User
Super User

There are a couple of things at play here (including inconsistency on Microsoft's side)  but you should seriously consider to use either Direct Query or Incremental refresh.

 

Here's a link you may find interesting: https://blog.crossjoin.co.uk/2020/07/05/why-is-power-bi-running-my-sql-query-twice/ 

 

 

 

View solution in original post

The code is a very good example of 'how to kill performance of a sql import' .  I'm not criticising, it's not always clear what power query does behind the scenes.

 

You want to help Power Query to use Query Folding to send the appropriate query back to the source system.

If you right-click in the 'Applied Steps' in Power Query, choose 'View Native Query', it'll be greyed out on all steps of the query.  This used to confuse me because the first SQL query should definitely fold (after all it is a native query).  I now understand that the initial step has 'View Native Query' greyed out because it's redundant to look at a query that you've just written.

The second and third step are greyed out because they don't fold.  This is damaging for performance because the filtering by date will take place in Power Query (after retrieving the 5 million rows).

 

2 solutions:

1- Rewrite the initial SQL query to include the date parameter in the statement.  This is more difficult with date parameters than text fields (so solution 2 may be easier for you) 

2- Don't use a SQL statement to connect.  Use the Power Query interface to connect to SQL, then navigate to the table, and then filter by the parameter. This is all in the ms powerbi documentation.  Just confirm at each step that 'View Native Query' is enabled. You might as well click it to confirm the query is what you want.

 

You will also want to confirm that there is an index on the date field in the SQL table.

 

-----------

A Refresh may work differently.  The Mashup engine is probably going to get involved (This is a more complex area - hopefully the above changes provide the performance improvement needed)

View solution in original post

2 REPLIES 2
lbendlin
Super User
Super User

There are a couple of things at play here (including inconsistency on Microsoft's side)  but you should seriously consider to use either Direct Query or Incremental refresh.

 

Here's a link you may find interesting: https://blog.crossjoin.co.uk/2020/07/05/why-is-power-bi-running-my-sql-query-twice/ 

 

 

 

The code is a very good example of 'how to kill performance of a sql import' .  I'm not criticising, it's not always clear what power query does behind the scenes.

 

You want to help Power Query to use Query Folding to send the appropriate query back to the source system.

If you right-click in the 'Applied Steps' in Power Query, choose 'View Native Query', it'll be greyed out on all steps of the query.  This used to confuse me because the first SQL query should definitely fold (after all it is a native query).  I now understand that the initial step has 'View Native Query' greyed out because it's redundant to look at a query that you've just written.

The second and third step are greyed out because they don't fold.  This is damaging for performance because the filtering by date will take place in Power Query (after retrieving the 5 million rows).

 

2 solutions:

1- Rewrite the initial SQL query to include the date parameter in the statement.  This is more difficult with date parameters than text fields (so solution 2 may be easier for you) 

2- Don't use a SQL statement to connect.  Use the Power Query interface to connect to SQL, then navigate to the table, and then filter by the parameter. This is all in the ms powerbi documentation.  Just confirm at each step that 'View Native Query' is enabled. You might as well click it to confirm the query is what you want.

 

You will also want to confirm that there is an index on the date field in the SQL table.

 

-----------

A Refresh may work differently.  The Mashup engine is probably going to get involved (This is a more complex area - hopefully the above changes provide the performance improvement needed)

Helpful resources

Announcements
PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

Top Solution Authors
Top Kudoed Authors