Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

How does Schedule Refresh internal logic work in Power BI

Hello,

I am newbie in PowerBI, i need to make a choice on type of connection to choose for my power bi reports. I have approx 50 DataSet with each having 4M records. I have read articles and found that "Import" will be faster. For now my report takes approx 45 min to refresh the data, i need to understand how the processing is done inside PowerBI for schedule refresh.

 

Does it truncate the previous imported data and reloads it or it updates some of the data and reinsert the new entry?

Can i schedule 50 DataSet to refresh at same time, how it will be processed parallel/sync way? What kind of issues i will run in.

 

Any help will be much appreciated!!!!

 

 

10 REPLIES 10
mohammedadnant
Impactful Individual
Impactful Individual

Hi @Anonymous 

 

First, Power BI refresh the data through data gateway, if you want to refreh multiple datasets at a time then, it is recommended to create multiple datasources in order to load balance (refer the doc below)

https://docs.microsoft.com/en-us/data-integration/gateway/service-gateway-high-availability-clusters

 

incremental data refresh is good, pls refer this to know in detail https://youtu.be/CRQjoFw8z0Q

 

but, If you think in future it will grow quickly like more than 10 million records then I prefer to go with direct query.

 

Thanks & Regards,

Mohammed Adnan

Learn Power Plaform using my Youtube Channel TAIK18 click on the name

Did I answer your question? Mark my post as a solution! Appreciate your Kudos!! !!

Thanks & Regards,
Mohammed Adnan
Learn Power BI: https://www.youtube.com/c/taik18
Anonymous
Not applicable

@mohammedadnant : Please explain what do you mean by multiple data sources, i have only 1 data Source MS SQL.

 

I did try direct query and it's working too slow and complete data does not render either in my visual. It gives error "The query result is too large. Consider removing unused columns, adding visual filters, or reducing the number of string type columns." I cannot remove any further columns they are needed.

 

The DBA says that these are indexed as well 😞

Anonymous
Not applicable

Does anyone have tried changing these setting once it's depoyed in Service?

Hi @Anonymous ,

 

Have you read the offical document?

https://docs.microsoft.com/en-us/power-bi/connect-data/incremental-refresh-overview 

You will need to add parameter for dataset in Power BI Desktop. After publishing it to Power Bi Service, you could change the value of parameter when setting incremental refresh.

 

Best Regards,

Jay

Community Support Team _ Jay
If this post helps, then please consider Accept it as the solution
to help the other members find it.
Anonymous
Not applicable

Hi Jay,

 

i am talking about highlighted setting, can we change these years ? These are not the parameters

 

Kavs30_0-1622618875528.png

 

Hi @Anonymous ,

 

Sure you can. It depends on the "Order Date" column in your dataset as you can see above.

 

Best Regards,

Jay

Community Support Team _ Jay
If this post helps, then please consider Accept it as the solution
to help the other members find it.
wwhittenton
Helper II
Helper II

Hi Kavs!

 

First, a distinction: I assume when you say you have 50 Datasets, you mean 50 tables. Below I'll say 'dataset', meaning the data loading into Power BI.


Import may not be faster, depending on what you're hoping to do with your data and the number/type of relationships you're utilizing. In general, a 45 minute refresh is considered a bit long. A suggestion (based on experience): remove any columns that are redundant or which you won't use in a given report from the tables.

 

You can make multiple reports from a single dataset, but I suggest different datasets if the reports are using vastly different data.

 

You can greatly shorten the time a refresh takes by implementing 'incremental refresh' in Power BI. This is available at least for Premium and PPU (Premium Per User) plans, and I believe they added it to Pro plans recently. Incremental Refresh requires setting up parameters and building out a piece in PBI Desktop, then setting up scheduled refresh in the Service (online).

 

Data can be truncated, with one likely scenario being that you load in only the past month or quarter of data, based on a date field. For your question specifically, if you do not use incremental refresh the entire dataset will be dropped and reloaded into memory.

 

You can only refresh a certain number of datasets at a time, varying based on the number of capacities purchased. 

 

refresh basics:

Data refresh in Power BI - Power BI | Microsoft Docs

 

incremental refresh:

Incremental refresh for datasets in Power BI - Power BI | Microsoft Docs

 

Hope this quick bit helps!

Anonymous
Not applicable

@wwhittenton : Thank You for detailed explaination. 🙂

 

Yes i have 50 tables ( 1 table in each .pbix) but there is no relationships between them. These table is shown as tabular visual. Unfortunately, there are too many columns that i have to show inside table visual though there is no redundant data.

 

User need to see entire data in the tabular visual so i don't have option to filter it based Year/Quarter.

 

Do you have any other recommendation?

 

I am trying incremental but there will be lot of effort in old system where the data comes from.

 

Also can you elaborate 

You can only refresh a certain number of datasets at a time, varying based on the number of capacities purchased. 

 

 

You're welcome! I'm still learning too, but been around the product ~4 years now.

 

I recommend splitting your tables into a Star Schema. It'll take more work up front but will much improve the performance of the reports. Making no assumptions of your knowledge, here's an explanation: Power BI Basics of Modeling: Star Schema and How to Build it - RADACAD

 

A warning - if you don't do a star schema, slicers might be really slow. When you open them, they scan the entire table for distinct values you can slice the data by.

 

Incremental Refresh should make your refreshes much faster - if you can update just yesterday or the past week everything will run very quickly up front. The data engine in PBI (VertiPaq) can handle huge data loads. 4m rows isn't going to swamp it.

 

Data refreshes -

If you are on Pro or PPU plans (I think; I've never used PPU), your data will refresh in the "shared" capacity - meaning it might take a little longer before the system pushes you through the queue. Generally that's not an issue, but don't expect it to let you do all 50 datasets at once.

If you have Premium, you'll have purchased a capacity or multiple. Those plans range from A1 to A6, with each having a different number of "vCores", basically processing power. Use "Backend vCores" * 1.5 (rounded up) to see the max # of concurrent refreshes on Premium.

 

wwhittenton_0-1622122797935.png

 

More Pro vs PPU vs Premium comparison:

Pricing & Product Comparison | Microsoft Power BI

 

Best,

 

Anonymous
Not applicable

In case of incremental refresh, do i have the ability to change Incremental Refresh settings after it has been published in BI Service. How this will take effect once i make the changes?

 

Kavs30_0-1622134529214.png

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors