Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I have difficulties making my data model a star schema. The report is mainly about showing the consumption of material needed for manufacturing our products.
Tables
Outs only has rows from this and last year as this is sufficient for the report. Stock is not filtered by date and is thus based on all rows.
Could you please tell me how I can make a star schema out of this? As what kind of tables should I treat stock, last vendor and ordered? How should I make the connections? Should I maybe just merge these tables with the item table?
Solved! Go to Solution.
Hi @TimmK ,
the best practices refer to Power BI as a model based tool and not as a report based tool. The goal here is to design a data model that can be applied to various questions and reports.
With a good data model you have common dimensions by which you can view your different metrics. At the same time, your model should be stable in case of changes, e.g. if you need to add another dimension or metric.
For example, the date dimension is one of the most important dimensions from my point of view and should be linked to almost every fact table. So, if modeled correctly, you can determine the inventory at any point in time (running total), your consumptions or receipts at certain points in time, and the delivery quantity at the delivery date.
A good BI data model should give analysis capabilities and not just solve a reporting question.
Hi @TimmK ,
To build a reasonable star model, you first need to understand its design principles: Dimension tables support filtering and grouping, Fact tables support summarization. There are cardinal attributes and screening directions, both of which also play a key role in the construction of the model. Based on the above and refer to the official documents, I hope to help you.
Here is the reference link:
Understand star schema and the importance for Power BI - Power BI | Microsoft Docs
If the problem is still not resolved, please provide detailed error information and let me know immediately. Looking forward to your reply.
Best Regards,
Henry
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi Henryk,
Thank you, I already know the reference, but this does not really help in my practical example.
For example, is a table that is originally a fact table still a fact table after I have aggregated it with the query editor with a group by function? Because then it only has rows with a unique PK and in this case it would be a 1:1 relationship to the item table. Thus, I wonder if I should just merge it with the item table?
Instead of me providing answers, let me provide questions.
what are your transactions and data tables? This is where you start. I think you have
consumption
stock
orders.
Don't assume you need to use the same names as your source system. You can call it "outs" if you like, but to me it is "consumption"
these are all fact tables.
Next step is to determine if you need 3 fact tables, or if you can combine these into 2 or 1 table. I think 3 is fine.
then ask, who, what, when, where.
who: vendor
what: item
when: date
where: not sure.
Keep the minimum i of about who, what, when, where in the fact table: just the keys. Put all the detail in the dim tables.
these are general guidelines. I hope it helps.
Thank you, Matt.
Yes, Consumption may be a better table name.
So would you say fact tables with distinct rows only are still considered fact tables? Stock, Last Vendor and Ordered have been reduced by GROUP BY or remove duplicates in the query editor. Thus, each row of them has only exactly one unique item key.
Together with the Consumption table I would then have four fact tables (Consumption, Stock, Last Vendor, Ordered).
I agree, Item, Vendor and Date must be the DIM tables. The where question is not relevant for this report, I think.
Do you think the following would be the star schema according to best practice then? Or would you do something differently?
Hi @TimmK ,
the best practices refer to Power BI as a model based tool and not as a report based tool. The goal here is to design a data model that can be applied to various questions and reports.
With a good data model you have common dimensions by which you can view your different metrics. At the same time, your model should be stable in case of changes, e.g. if you need to add another dimension or metric.
For example, the date dimension is one of the most important dimensions from my point of view and should be linked to almost every fact table. So, if modeled correctly, you can determine the inventory at any point in time (running total), your consumptions or receipts at certain points in time, and the delivery quantity at the delivery date.
A good BI data model should give analysis capabilities and not just solve a reporting question.
Thank you, @mwegener
This makes sense to me in general having this kind of model perspective being able to answer various questions, also those that may not be seen yet.
However, could possible performance improvements also be important/useful in some cases? For example, when I aggregate in Power Query I can reduce the rows from let's say 1.5 million to just a few ten thousand rows. This may result in reducing overall loading time four times compared to no aggregation. So maybe one could trade-off universal flexibility/changeability with better performance in some cases?
Kimball briefly describes here "Aggregate Fact Tables" and their use to accelerate query performance:
Aggregate Fact Tables or Cubes | Kimball Group
Hi @TimmK,
I would consider the performance improvement secondary. Since Power BI's analytics database is optimized for aggregating data. The main focus should be on usability and maintainability at the beginning. In my view, an increased effort for performance is only justified in case of concrete performance problems.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
110 | |
94 | |
81 | |
66 | |
58 |
User | Count |
---|---|
150 | |
119 | |
104 | |
87 | |
67 |