I have a few Dataflows in a workspace that is not backed by Premium capacity. These Dataflows retrieve multiple .csv files from some Azure Blob Storage containers (residing in the same region as the Power BI service), then transform/extract/combine the data in those files into various entities. Since there is no Premium capacity, there are no linked entities, but there are multiple entities with "Enable load" turned off that serve as reusable intermediate steps. As the number of entities grew, I looked into ways to optimize my queries to reduce data load time, yet I found something very intriguing: no matter how simple or complicated the transformation behind an entity, the minimum time it takes to refresh one entity is 30s. An entity as simple as directly reading a 10-row, 2-column reference table may take 31s, while a big fact table built from 3 to 4 intermediate tables with some joins and lookups may take 34s. So I just wonder, is this "minimum 30s" phenomenon a result of:
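To make the pattern concrete, here is a minimal Python sketch of what I mean by a fixed floor plus a small data-dependent component. The entity names and durations are illustrative values based on my observations, not a benchmark:

```python
# Observed refresh durations (seconds) from the "Refresh history" window.
# Entity names and timings are illustrative, not measured benchmarks.
observed = {
    "RefTable_10rows": 31,   # trivial 10-row, 2-column reference table
    "DimCustomer": 32,       # mid-sized dimension table
    "FactSales": 34,         # joins/lookups over 3-4 intermediate tables
}

# If a fixed per-entity overhead dominates, the smallest observed duration
# approximates that floor, and the data-dependent part is the remainder.
floor = min(observed.values())
data_dependent = {name: t - floor for name, t in observed.items()}

print(f"estimated fixed floor: ~{floor}s")
for name, extra in data_dependent.items():
    print(f"{name}: +{extra}s beyond the floor")
```

Under this reading, the actual transformation work accounts for only a few seconds even on the fact table, while the 30-ish-second floor is paid by every entity regardless of size.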
Note that the times mentioned above are as recorded in the "Refresh history" modal window from scheduled and on-demand refreshes, not the time it takes to preview refreshes during query authoring.
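For anyone who wants to check the same durations outside the modal window, per-refresh transactions can also be pulled from the Power BI REST API ("Dataflows - Get Dataflow Transactions"). The sketch below only shows the duration arithmetic on a sample payload shaped like that API's response; the field names and timestamps here are assumptions for illustration, so verify them against your own tenant:

```python
from datetime import datetime

def duration_seconds(start_iso: str, end_iso: str) -> float:
    """Duration between two ISO-8601 timestamps (handles a trailing 'Z')."""
    start = datetime.fromisoformat(start_iso.replace("Z", "+00:00"))
    end = datetime.fromisoformat(end_iso.replace("Z", "+00:00"))
    return (end - start).total_seconds()

# Sample payload shaped like a "Get Dataflow Transactions" response;
# the ids and timestamps are made up for illustration.
transactions = [
    {"id": "t1", "startTime": "2024-05-01T08:00:00Z",
     "endTime": "2024-05-01T08:00:31Z", "status": "Success"},
    {"id": "t2", "startTime": "2024-05-01T09:00:00Z",
     "endTime": "2024-05-01T09:00:34Z", "status": "Success"},
]

for tx in transactions:
    secs = duration_seconds(tx["startTime"], tx["endTime"])
    print(f"{tx['id']}: {secs:.0f}s ({tx['status']})")
```

Note that the transactions API reports whole-dataflow refreshes, so the per-entity breakdown still has to come from the refresh history window.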
I would also be interested to hear from other users. Are you experiencing the same pattern? What about users who are running Dataflows in Premium? Are your dataflows refreshing faster?
Hi @Anonymous ,
Theoretically speaking, running a dataflow in a Premium workspace should be faster. Power BI Premium provides dedicated and enhanced resources to run the Power BI service for your organization, so you get greater scale and performance. For more details, please check the online documentation.
Thanks for your reply. Would you please provide me with any definitive answer on why the minimum time required to refresh each entity, no matter how small the size or how simple the transformation, is 30s?
Hi @Anonymous ,
According to the online documentation, there is no information about a fixed minimum. Power BI dataflows use the Power BI data refresh process to keep your data up to date, so refresh time is affected by many factors, such as the network, the performance of the underlying data source, and the size of the data.
With respect, I wouldn't consider this one-size-fits-all generic answer acceptable, namely that dataflow refresh is "impacted by many factors like network, performance of underlying datasource and size of data". I understand that these must be contributing factors, but they definitely wouldn't explain the whole thing. The reasons are twofold:
It would be really helpful if you could either:
I see this was thrown into the "too hard to respond" bucket.