Hi,
I'm looking for an approach where I make a 'one-time' effort to build a data model upfront, on top of our source systems, so that end-users don't have to worry about any data modelling. Instead, they simply connect to the modelled data and spend their effort producing the actual report, with a little DAX. Of course, producing a 'one-time' data model also means that any data changes only need to be made in one place.
So how do I do this? Well, I've read about the use of Dataflows. Dataflows seem to be exactly what I'm looking for: connect to the different data sources from within the cloud, complete any transformations, and then save the result as a Dataflow so it's available in the cloud to be consumed, as a data source, by end-users in Power BI Desktop (Desktop because DAX can be written there, which can't be done from the Service).
This sounds like a great feature except for one big problem (if I've understood it correctly): to refresh a derived table (a computed entity) I would need Premium capacity, which we don't have and will not have. I often need to create a derived table that is the result of joining two tables, and it has to be a new derived table because the original two tables are used in their own right and therefore must remain unchanged.
Have I understood this correctly and do others find this a showstopper?
Then I ask, why not just use Datasets?
Use Desktop to connect to multiple data sources, create as many derived tables as you need (no Premium capacity is required), and complete the 'one-time' data model. Once complete, publish that .pbix file (at this point it contains only Power Query work and no visuals) to the Power BI Service. End-users can then connect to that Dataset and use it as the pre-built data model. Datasets do not need Premium capacity to refresh derived tables (aka computed entities).
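For illustration, a derived table of the kind described here could be built in Desktop as a new Power Query M query that references the two originals, leaving them intact. This is only a sketch; the Orders and Customers query names and columns are hypothetical:

```powerquery
// Hypothetical derived table: join Orders to Customers without
// modifying either original query. Because this is a new query in a
// .pbix file rather than a dataflow computed entity, refreshing it
// does not require Premium capacity.
let
    // Reference the two original queries (they remain usable on their own)
    Source = Table.NestedJoin(
        Orders, {"CustomerID"},
        Customers, {"CustomerID"},
        "CustomerDetail", JoinKind.LeftOuter
    ),
    // Flatten the joined columns into the derived table
    Expanded = Table.ExpandTableColumn(
        Source, "CustomerDetail", {"CustomerName", "Region"}
    )
in
    Expanded
```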
Are others using Datasets, instead of Dataflows, for this purpose?
Thanks in advance.
Hi, @D_PBI
As for your question, I personally think it depends on your individual needs. Power BI dataflows and Power BI datasets do share some similarities. Both tools are available and can be used to solve various problems. Each has its advantages and disadvantages, and you should consider the right tool for the job.
Datasets are a tool designed for modeling data for BI needs. In a dataset you can manage your data, create calculations and measures, and apply modeling tools for BI. The output of a dataset is a Tabular data model: a highly compressed and optimized group of tables that is very good at answering aggregated queries. I also recommend browsing the Dataset documentation.
Dataflows are designed as a sort of a cloud ETL process designed to assist you in connecting to various data sources, ingesting data and prepping it for BI.
The output of a Dataflow is a set of CSV files stored in a generic file store. These are great for reuse as building blocks of many different datasets.
I recommend browsing the introduction article and the Dataflows documentation.
Best Regards,
Eason
Community Support Team _ Eason Fang
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
@v-easonf-msft thanks for your reply.
If I look at using a Dataset as a Tabular model and a Dataflow as a data warehouse, then maybe I should be looking to create a structure like the one below:
Dynamics365 + SQL Server
  v
Dataflow (many Entities combining common data from Dynamics365 and SQL Server)
  v
Dataset (where I can re-model the separate CDS Entities into a multi-dimensional model)
  v
Power BI (to consume the multi-dimensional Dataset and write DAX against it)
Following this approach, I would keep the disparate systems' Entities (account, customer, etc.) as separate Entities in the Dataflow (i.e. the data warehouse) and then look to denormalise them in the Dataset (which, being a Tabular model, is its purpose).
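Once end-users connect Desktop to such a shared Dataset, their remaining work is largely writing measures. A minimal DAX sketch of what that might look like (the Sales table, Sales[Amount] column, and 'Date' table are assumed names, not from the original post):

```dax
-- Hypothetical measures an end-user might write against the shared,
-- pre-modelled Dataset; table and column names are illustrative only.
Total Sales := SUM ( Sales[Amount] )

Sales YoY % :=
VAR PriorYear =
    CALCULATE ( [Total Sales], SAMEPERIODLASTYEAR ( 'Date'[Date] ) )
RETURN
    DIVIDE ( [Total Sales] - PriorYear, PriorYear )
```

The point of the design is that these measures sit on top of the pre-built model, so report authors never touch Power Query or the source systems.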
Is this the purpose of the Dataflow and Dataset objects?
Thanks.
Hi, @D_PBI
Yes, that's a reasonable way to understand it.
As we know, Power BI introduces dataflows to help organizations unify data from disparate sources and prepare it for modeling. Dataflows are designed to use the Common Data Model, a standardized, modular, extensible collection of data schemas published by Microsoft that are designed to make it easier for you to build, use, and analyze data. With this model, you can go from data sources to Power BI dashboards with nearly zero friction.
Dataflows enable you to get better control – and faster insights – from your business data.
By storing business data in the standardized format of the Common Data Model, you can create apps that generate quick, easy, and automatic visuals and reports. Those include, but aren’t limited to:
Mapping your data to standard entities in the Common Data Model to unify data and leverage the known schema to drive out-of-the-box insights
Creating your own custom entities to unify data across your organization
Using and refreshing external data as part of a dataflow, and enabling import of that data to drive insights.
Best Regards,
Eason