Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
mark_carlisle
Advocate IV
Advocate IV

Advice required on incremental refresh or disable loading of tables

I have 3 data sources;

 

Surveys - Excel file

Incident - SQL

IncidentDetail - SQL

 

The data that I'm interested in is the customer feedback in Surveys. Now from what I understand incremental refresh on Excel files essential doesn't work as it loads the whole thing to then determine whats needs adding so you might as well just refresh the whole thing anyway.

 

The Excel file does not contain the Incident Reference and there has been a request to add it. The Excel file does however contain a IncidentDetailId which is in the IncidentDetail table. The IncidentDetail table contains an IncidentId which links to the Incident table that contains the Incident Reference, so no problems load Incident and  IncidentDetail in to PBI and join them, PQ below;

 

let
    Source = Sql.Database("SQLDATABASE", "SQLDATABASE"),
    dbo_IncidentDetail = Source{[Schema="dbo",Item="IncidentDetail"]}[Data],
    #"Filtered Rows" = Table.SelectRows(dbo_IncidentDetail, each [CreateDate] >= StartDateOfSurveys),
    #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"IncidentDetailId", "IncidentId"}),
    #"Merged Queries" = Table.NestedJoin(#"Removed Other Columns",{"IncidentId"},Incident,{"IncidentId"},"Incident",JoinKind.LeftOuter),
    #"Expanded Incident" = Table.ExpandTableColumn(#"Merged Queries", "Incident", {"IncidentReference"}, {"IncidentReference"}),
    #"Removed Other Columns1" = Table.SelectColumns(#"Expanded Incident",{"IncidentDetailId", "IncidentReference"})
in
    #"Removed Other Columns1"

The above is using query folding so I know is as efficient as it can be. I then Merge the Surveys with IncidentDetail and Expand to bring in the Incident Reference. All works. No problems.

 

However, I want to make this more efficient and I'm stuck with what to do. I've disabled the load of Incident and IncidentDetail as I do not need them in the model, but, unless I'm mistaken, every time the report gets refreshed its going to pick up all the data in those tables (approx 5m rows)? Would it not be better to have them load into the model but apply incremental refresh so following the initial load only new data is added?

1 ACCEPTED SOLUTION

Hi @mark_carlisle,

 

That could depend on your scenario. For instance, 

1. If you only need to refresh once and you set the scheduled refresh in the middle night, I would suggest you not to load these queries. 

2. If you need to refresh the report in the work hours, I would suggest you try the incremental refresh. The refresh speed could be better. Maybe you don't need to merge a big table in this scenario.

 

 

Best Regards,

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

4 REPLIES 4
v-jiascu-msft
Employee
Employee

Hi @mark_carlisle,

 

I would suggest you try out the premium feature, service-premium-incremental-refresh

Since you disable the load of these queries, why did you include them? You can't even use them in the reports.

 

Best Regards,

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

I connect to them as they hold data that is not available in the Surveys Excel file, but I do not require them to be loaded into the model as I can merge the data then contain with the Surveys query based on the unique keys held in the tables. Apologies if that was not clear in the initial post.

 

I'm familiar with how to setup incremental refresh, the question is really around what is better in terms of performance, either;

 

  • Load all 3 tables in to the model. Allowing me to turn on incremental refresh for Incident and IncidentDetail. This would allow only the most recent data to be imported on subsequent refreshes but loads the tables into the model.
  • Leave things as they are, Incident and IncidentDetail do not load but presumably perform a full refresh of their data so it can be merged with Surveys to bring in the Incident Reference.

Hi @mark_carlisle,

 

That could depend on your scenario. For instance, 

1. If you only need to refresh once and you set the scheduled refresh in the middle night, I would suggest you not to load these queries. 

2. If you need to refresh the report in the work hours, I would suggest you try the incremental refresh. The refresh speed could be better. Maybe you don't need to merge a big table in this scenario.

 

 

Best Regards,

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Option 1 for me then. Thanks.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.