Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

mwegener

Incremental Refresh with Subqueries

Since the incremental refresh is already available with a Pro license, it was time for me to deal with this topic.

In the Power BI community, I found out that the known documentation describes how to set up an incremental refresh, but the examples are always based on a single query. In practice, however, combined queries from several subqueries are quite common.

At this point it should be noted that the incremental refresh is a function that you set up in Power Query and Power BI Desktop, but which is only used and controlled by data loading in the Power BI Service.

But first, let’s look at how we can take advantage of the incremental refresh.

By using the two filter parameters RangeStart and RangeEnd, we can control which period of data should be loaded from the source. This reduces the loading time and ensures a low data volume that has to be transported to the Power BI Service.
On the other hand, we can determine which period of data should be kept in the analysis memory.

20200328_02_Parameters.png

As we can see, there are two features that we need to consider when setting up incremental refresh.

  • First, we have to use the RangeStart and RangeEnd filter parameters in the subqueries as early as possible in order to limit the queried data in the source and thus save loading time.
  • Second, we have to use the filter parameters RangeStart and RangeEnd in the combined query to identify the data stock in the analysis memory that is to be retained.

I would like to illustrate it with a small example. In Business Central, the document information is stored in header and line tables, which we combine in a query for analysis.

In this example, we assume that we want to keep (save) the records with a posting date in the last 2 years and refresh records with a posting date in the last 7 days.

20200328_00_Refresh.png

The following diagram shows how the Power Query Engine loads the invoice header and line data of the last 7 days from Business Central and merges them into one invoice. Then the query result, not the underlying basic data of the subqueries, is saved in the analysis memory by the Power BI Service. If data from a previous data load already exists for this period, it will be deleted. If data exceeds the retention period of 2 years, it will also be deleted. To identify the corresponding data records, the filter parameters RangeStart and RangeEnd must also be set in the query result.

20200328_01_2_Diagram.png

In Power Query, the sales invoice header and line subqueries have been disabled for loading to the report, and only the combined sales invoice query is loaded to the report.

20200328_01_QueryDependencies.png

Here, we first filter the subqueries sales invoice header and row, in order to retrieve only the required period from Business Central. To do this, we set a Date Filters > “Between…” in the PostingDate column.

20200328_03_3_FilterSubquery.png

If the column is of the “Date/Time” data type, we can use the RangeStart and RangeEnd parameters as filter values in the dialog. In this case, since the PostingDate column is of the “Date” data type, we first have to store fixed values.

20200328_03_4_FilterSubqueryDialog.png

We then replace the fixed values in the formula bar.
For this we convert the parameters RangeStart and RangeEnd with the function Date.From from Date/Time to Date.
You should not change the data type of the column to ensure that the data is filtered by the source and not by the Power Query Engine.

20200328_03_2_FilterSubQuery.png

Then we filter the combined sales invoice query so that the analysis memory can identify which data records in the memory should be retained or replaced.

20200328_04_FilterCombinedQuery.png

Now we can set up the incremental refresh in Power BI Desktop and publish the report to Power BI Service.
The “Incremental refresh” dialog is only displayed for the query (table) if the final query uses the RangeStart and RangeEnd parameters.

20200328_05_Configuration-1536x832.png

From now on, the Power BI Service takes over control via the RangeStart and RangeEnd parameters and automatically sets them for the first full load and the subsequent incremental refreshes.

 

IMPORTANT: Reports with an incremental refresh cannot currently be downloaded from the Power BI Service.

Comments

I think I got lost but to make sure I understood it correctly you did the following in the order ??

  • Added all the tables in the PBI desktop
  • Filtered both the staging tables (Invoice Header, Invoice Line) using the Parameters
  • Disabled Load on both the staging tables, Hide them from Report View as well
  • Created the combined Invoice table (using Merge Queries)
  • Applied filter to this Invoice table with Parameters
  • Enabled incremental refresh only on Invoice table

Let me know if this is accurate @mwegener 

Hi @anilthapa11 ,

yes that is correct.

Hi
I have one data set , prepared by using multiple tables, each table having updated_date, now which date colun need to take for incremental refresh.
Example.
Table1,Table2,Table3 joined Main Table prepared
Main Table having Updated_Date1(Table1),Updated_Date2(Table2),Updated_Date3(Table3),
how shall i proceed please help me on this.