The connection between Dataflow and Dataset refreshes could be automated using the API and MS Flow. Something like this:
1. Trigger your dataflow refresh via the API, for instance using PowerShell, which you could host/run from Azure Functions
Incidentally, this approach gives you more fine-grained control over your refreshes than the static scheduling options in the Power BI service. For instance, you could trigger your dataflow refreshes when their sources are updated, or integrate your Power BI refreshes into broader Azure Data Factory processes:
2. Set NotifyOption to MailOnCompletion in the API call above, so that an email is sent to an inbox monitored by a Flow workflow. This email is your Flow trigger.
3. Trigger your "client" dataset refresh via the Power BI REST API as the follow-up action in the same Flow, as explained here:
4. Push a notification (email, RSS, Teams etc.) once the thing is complete, based on a similar flow triggered by the dataset refresh's success (or failure). A side benefit of handling your own notifications is that you get to choose the format, content, notification channel, and recipient(s), whereas the Power BI service hard-codes all of these.
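To make steps 1 and 3 a bit more concrete, here is a minimal sketch in Python rather than PowerShell. It only builds the two refresh requests against the Power BI REST API and does not send them. The workspace, dataflow, and dataset IDs and the token are placeholders, `build_refresh_request` is a hypothetical helper name, and getting an Azure AD access token is left out. Check the API reference for which `notifyOption` values each endpoint accepts, as they may differ between dataflows and datasets.

```python
import json
import urllib.request

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_refresh_request(group_id, kind, item_id, notify_option, token):
    """Build (but do not send) the POST request that starts a refresh.

    kind is "dataflows" for step 1 or "datasets" for step 3.
    """
    if kind not in ("dataflows", "datasets"):
        raise ValueError("kind must be 'dataflows' or 'datasets'")
    url = f"{API_ROOT}/groups/{group_id}/{kind}/{item_id}/refreshes"
    body = json.dumps({"notifyOption": notify_option}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values: replace with your own workspace, item IDs and token.
    step1 = build_refresh_request(
        "<workspace-id>", "dataflows", "<dataflow-id>", "MailOnCompletion", "<token>"
    )
    step3 = build_refresh_request(
        "<workspace-id>", "datasets", "<dataset-id>", "MailOnCompletion", "<token>"
    )
    for req in (step1, step3):
        print(req.get_method(), req.full_url)
    # To actually fire a refresh you would call urllib.request.urlopen(req).
```

In the setup described above, Flow would make the step 3 call for you once the completion email arrives, so you would only need the step 1 request in your Azure Function.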
I don't understand... why do we have to schedule the Dataset? From all the Microsoft articles that I read about Dataflow, it was made to look like the dataset would automatically connect to the Dataflow and show users the latest data. Does this mean if I create 10 datasets using my Enterprise Dataflow, I need to configure all 10 Datasets to refresh separately? On top of that, we have to rely on the API to get faster updates? This is totally unproductive and totally defeats the purpose of having all my data on Cloud services already!
I hope none of this is true and I don't have to keep separate schedules for Dataflows and Datasets!
@SamRock dataflows are not pushing data directly to datasets, they're just making the data available in a datalake. You might even have a dataflow that does not have a destination dataset. You do have to schedule refreshes in your datasets separately from your dataflows, just like you have to schedule refreshes against any other data source.
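For the 10-dataset case, the separate refreshes don't have to be configured by hand one at a time. A rough sketch, assuming you keep your own list of which datasets read from which dataflow: the `DOWNSTREAM` mapping, its IDs, and the `dataset_refresh_urls` helper are all hypothetical, and Power BI does not hand you this mapping in this form. The function just returns the refresh endpoint for each dependent dataset, which you would then POST to once the dataflow refresh has finished.

```python
API_ROOT = "https://api.powerbi.com/v1.0/myorg"

# Hypothetical, hand-maintained mapping: dataflow ID -> IDs of datasets that import from it.
DOWNSTREAM = {
    "dataflow-sales": ["dataset-01", "dataset-02", "dataset-03"],
}


def dataset_refresh_urls(group_id, dataflow_id, downstream=DOWNSTREAM):
    """Return the refresh endpoint of every dataset that depends on a dataflow."""
    return [
        f"{API_ROOT}/groups/{group_id}/datasets/{dataset_id}/refreshes"
        for dataset_id in downstream.get(dataflow_id, [])
    ]


if __name__ == "__main__":
    for url in dataset_refresh_urls("<workspace-id>", "dataflow-sales"):
        print("POST", url)
```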
@otravers Thanks for the response. I still feel this defeats the purpose of doing the ETL on Power BI's own storage, where my Datasets also reside. The purpose of Dataflow was to have a centralized data source that all reports can consume. Now it's basically telling me to take care of individual report refreshes, in spite of having everything in a "centralized" place?! Ideally Dataset should work like a "Direct" Querying to Dataflow, not an Import
@SamRock wrote: Ideally Dataset should work like a "Direct" Querying to Dataflow, not an Import
Let me try and explain why I disagree. That is really not "ideal", as Direct Query is meant to be used over structured data sources, not unstructured data lakes. Azure Data Lake is not supported as a source for DQ:
Moreover, Import performs better than DQ and is really Power BI's preferred mode unless you have too much data or need real-time updates.
You can make the case that it should be easier to sync dataflow and dataset refreshes, but I think you're possibly confused about some of the architectural options offered by Power BI. The fact that you're using dataflows as a data source in a dataset should not limit how that dataset can work or be refreshed.