Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Datasets - one large or many small

Hello,

I am trying to figure out the philosophy regarding datasets: should I create ONE big, general dataset or MANY small datasets?

The thing is I have 6 different departments and first idea was to share to their respective Microsoft Teams -workspace. But that means if I update one report, it has to be publishes to manu workspaces. But if I have one general for all, there will be difficulties with bi-directional relationships and also potential sales data would be accesses by other departments.

What is the best way to think about this? Also updating measures and everything means those has to be done to all datasets.

So 2 questions:
1. One big or many small datasets?
2. How to distribute correctly to many departments?


Thanks alot! This is really driving me nuts!
9 REPLIES 9
edhans
Super User
Super User

It may be a combination. You could have one big dataset with most or perhaps all of the data, and then share that dataset to the otehr departments. You can then restrict what they can and cannot do/see. So your primary data set can have Sales, but you don't expose Sales to the Engineering team for example. You need to have the new Gen2 workspaces to do this, so if you have Gen1, click the little upgrade button in the workspace settings.

 

There is no perfect solution to your issue - it is all about optimization, not perfection. Share Datasets has been a HUGE boon for me in this processes, but there are still times I completely segregate data and datasets. That, and I've not pulled the trigger on upgrading all of our workspaces to the new Gen2 type yet.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
Anonymous
Not applicable

Yes, several departments uses the same kind of data.

However, RLS feels a little bit weird. When people open up the reports and just see blanks on one page?

 

But I think I will go back from multiple datasets to two datasets. Just a mess to keep track on measures and changes in measures between datasets.

 

Thank you so much to both of you!

Anonymous
Not applicable

Hi,

 

One thing you could do to customize reports for different departments but still use a common data model is to publish the report + model to the Power BI Service. Then you create a new report and connect it to the dataset in the service:

 

Skärmklipp.PNG

 

Best Regards

Ulf

Anonymous
Not applicable

Yes, but that still requires one BIG dataset, which means the reports will still be a bit slow compared to smaller ones.

 

But I guess it's all a matter of a balance. Can't have both.

Anonymous
Not applicable

Yes, it's a matter of balance.

 

How complicated calculations you have and the structure of the model is in many cases more important than the number of records. I have made dimensional models (with facts separated from dimension attributes) with up to a 100 million rows and the performance is still very good... And with partitioning (incremental refresh) avaiable in Power BI Pro updating of the data is no longer an issue for large models.

Anonymous
Not applicable

Seriously, did they remove that limitation like this week or something?

Seems to be open for Pro users now? 😍

EDIT: https://powerbi.microsoft.com/en-us/blog/incremental-refresh-is-generally-available/

Anonymous
Not applicable
Anonymous
Not applicable

I thought incremental was only for Premium? At least in dataflows.

I think there are optimizations to be made for my dataset though.

Thanks for the help!
Anonymous
Not applicable

Hi,

 

Different subject areas and data that is not related should go into different data models. But if several departments uses the same kind of data, let them share a large model with data from all departments. Then add roles to the model and use row level security to let the users see only the parts od the data they are allowed to see. You will then only need to deploy the report to one workspace/app and users from different departments can use it and see different numbers. 

 

To avoid problems with relationships, use some structured design strategies when creating the data model. Dimensional Modeling is a VERY good modeling teqnique for Power BI.

 

Good Luck!

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.