Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hello,
I am newbie in PowerBI, i need to make a choice on type of connection to choose for my power bi reports. I have approx 50 DataSet with each having 4M records. I have read articles and found that "Import" will be faster. For now my report takes approx 45 min to refresh the data, i need to understand how the processing is done inside PowerBI for schedule refresh.
Does it truncate the previous imported data and reloads it or it updates some of the data and reinsert the new entry?
Can i schedule 50 DataSet to refresh at same time, how it will be processed parallel/sync way? What kind of issues i will run in.
Any help will be much appreciated!!!!
Hi @Anonymous
First, Power BI refresh the data through data gateway, if you want to refreh multiple datasets at a time then, it is recommended to create multiple datasources in order to load balance (refer the doc below)
https://docs.microsoft.com/en-us/data-integration/gateway/service-gateway-high-availability-clusters
incremental data refresh is good, pls refer this to know in detail https://youtu.be/CRQjoFw8z0Q
but, If you think in future it will grow quickly like more than 10 million records then I prefer to go with direct query.
Thanks & Regards,
Mohammed Adnan
Learn Power Plaform using my Youtube Channel TAIK18 click on the name
@mohammedadnant : Please explain what do you mean by multiple data sources, i have only 1 data Source MS SQL.
I did try direct query and it's working too slow and complete data does not render either in my visual. It gives error "The query result is too large. Consider removing unused columns, adding visual filters, or reducing the number of string type columns." I cannot remove any further columns they are needed.
The DBA says that these are indexed as well 😞
Does anyone have tried changing these setting once it's depoyed in Service?
Hi @Anonymous ,
Have you read the offical document?
https://docs.microsoft.com/en-us/power-bi/connect-data/incremental-refresh-overview
You will need to add parameter for dataset in Power BI Desktop. After publishing it to Power Bi Service, you could change the value of parameter when setting incremental refresh.
Best Regards,
Jay
Hi Jay,
i am talking about highlighted setting, can we change these years ? These are not the parameters
Hi @Anonymous ,
Sure you can. It depends on the "Order Date" column in your dataset as you can see above.
Best Regards,
Jay
Hi Kavs!
First, a distinction: I assume when you say you have 50 Datasets, you mean 50 tables. Below I'll say 'dataset', meaning the data loading into Power BI.
Import may not be faster, depending on what you're hoping to do with your data and the number/type of relationships you're utilizing. In general, a 45 minute refresh is considered a bit long. A suggestion (based on experience): remove any columns that are redundant or which you won't use in a given report from the tables.
You can make multiple reports from a single dataset, but I suggest different datasets if the reports are using vastly different data.
You can greatly shorten the time a refresh takes by implementing 'incremental refresh' in Power BI. This is available at least for Premium and PPU (Premium Per User) plans, and I believe they added it to Pro plans recently. Incremental Refresh requires setting up parameters and building out a piece in PBI Desktop, then setting up scheduled refresh in the Service (online).
Data can be truncated, with one likely scenario being that you load in only the past month or quarter of data, based on a date field. For your question specifically, if you do not use incremental refresh the entire dataset will be dropped and reloaded into memory.
You can only refresh a certain number of datasets at a time, varying based on the number of capacities purchased.
refresh basics:
Data refresh in Power BI - Power BI | Microsoft Docs
incremental refresh:
Incremental refresh for datasets in Power BI - Power BI | Microsoft Docs
Hope this quick bit helps!
@wwhittenton : Thank You for detailed explaination. 🙂
Yes i have 50 tables ( 1 table in each .pbix) but there is no relationships between them. These table is shown as tabular visual. Unfortunately, there are too many columns that i have to show inside table visual though there is no redundant data.
User need to see entire data in the tabular visual so i don't have option to filter it based Year/Quarter.
Do you have any other recommendation?
I am trying incremental but there will be lot of effort in old system where the data comes from.
Also can you elaborate
You can only refresh a certain number of datasets at a time, varying based on the number of capacities purchased.
You're welcome! I'm still learning too, but been around the product ~4 years now.
I recommend splitting your tables into a Star Schema. It'll take more work up front but will much improve the performance of the reports. Making no assumptions of your knowledge, here's an explanation: Power BI Basics of Modeling: Star Schema and How to Build it - RADACAD
A warning - if you don't do a star schema, slicers might be really slow. When you open them, they scan the entire table for distinct values you can slice the data by.
Incremental Refresh should make your refreshes much faster - if you can update just yesterday or the past week everything will run very quickly up front. The data engine in PBI (VertiPaq) can handle huge data loads. 4m rows isn't going to swamp it.
Data refreshes -
If you are on Pro or PPU plans (I think; I've never used PPU), your data will refresh in the "shared" capacity - meaning it might take a little longer before the system pushes you through the queue. Generally that's not an issue, but don't expect it to let you do all 50 datasets at once.
If you have Premium, you'll have purchased a capacity or multiple. Those plans range from A1 to A6, with each having a different number of "vCores", basically processing power. Use "Backend vCores" * 1.5 (rounded up) to see the max # of concurrent refreshes on Premium.
More Pro vs PPU vs Premium comparison:
Pricing & Product Comparison | Microsoft Power BI
Best,
In case of incremental refresh, do i have the ability to change Incremental Refresh settings after it has been published in BI Service. How this will take effect once i make the changes?