Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi everyone, I'm having some difficulty with a new dashboard I'm working on. This is my first time implemenenting a star schema but I have one central "fact" table surrounded by "dimension" tables, linked with unique keys. I have several columns of survey data in the fact table that need to be unpivoted but this is causing duplicates of all the other value fields in the fact table. Here is a mock up of my starting data:
ID | Value | Survey1 | Survey2 |
1 | 10 | 1.1 | 2.1 |
1 | 10 | 1.11 | 2.11 |
2 | 20 | 1.2 | 2.2 |
And here it is after the unpivot:
ID | Value | Survey | Answer |
1 | 10 | Survey1 | 1.1 |
1 | 10 | Survey2 | 2.1 |
1 | 10 | Survey1 | 1.11 |
1 | 10 | Survey2 | 2.11 |
2 | 20 | Survey1 | 1.2 |
2 | 20 | Survey2 | 2.2 |
As you can see, after the survey columns are unpivoted, there are duplicate values in the value column. So, for example, if I want a card that shows the sum of an ID's value column, its much bigger than it should be. How do I work around this without creating another table and ruining the star schema? Happy to provide more context, thanks!
Solved! Go to Solution.
Hi everyone, thanks for the suggestions! I actually found a solution from another post that does exactly what I need. Here is the formula I used in case anyone else has the same question:
Hi everyone, thanks for the suggestions! I actually found a solution from another post that does exactly what I need. Here is the formula I used in case anyone else has the same question:
There's nothing wrong with the design. When you know that values should only be summarized in specific ways (for example a SUM here makes no sense), you take that into account when creating the visualisations (by using the appropriate default summarization).
You can control how other users use the field by creating a measure (in this case it would be MIN(Value))
Okay well at least my data model is fine, still learning and all that...
However I'm not sure MIN(Value) is correct here. In my mock data, ID = 1 has two rows, so the correct sum would be 20, but if I used MIN(Value) I would get 10. I know there are probably quite a few ways to do this but what would you recommend? Also its worth noting that the real datasets are much more complicated and have multiple levels of unpivoting and aggregation. Thanks for the answer!
l get back to you (i'm in the middle of something right now).
You're quite right about MIN not being correct (I didn't really look at your data closely)
I wouldn't say there is a definitive answer . Whatever gets the job done in a reasonably efficient, performant manner is correct (taking account of good design practice as much as possible)
A few solutions:
1. A measure would be
DIVIDE(SUM(TableP[Value]), DISTINCTCOUNT(TableP[Survey]))
I think that produces the 20, 20 result to show against ID.
or 2. You could create a dimension to hold the ID and Total Value . This would be created in Power Query from the original 'pivoted' table with a GROUP BY 'id' and SUM the Value.
When creating a visual, if it's simply ID and Value just take the values from this dimension.
or 3. You could divide up the Value when unpivotting the table (assign a value to each line in the fact table to represent the share - so ID 2 would have 2 lines of value 10. That would make the aggregation easier , I think it would just be SUM(Value). I've no idea if this makes sense in the specific business scenario i.e. does ID 2 still have a value of 20 if there is only one survey?
Hope that helps.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
49 | |
25 | |
20 | |
15 | |
12 |
User | Count |
---|---|
57 | |
49 | |
44 | |
18 | |
18 |