Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Greg_Deckler

Design Pattern - Groups and Super Groups!

The Challenge

A fairly common data model design pattern can be described such that:

  • Given a set of criteria that are numerically scored
  • These scored criteria are grouped together in arbitrary sets and the group scored via an aggregation of criteria scores
  • These groups of scores are further grouped into arbitrary sets (super groups) and the super group scored via an aggregation of group scores

For example, a restaurant rating system might uses a ranking criteria of satisfactory from 0-100. These questions could be grouped into questions about "Service", "Atmosphere", "Quality", "Value" and "Cleanliness" where each group's score is simply an average of the individual question scores. These five groups are then grouped into "Restaurant" and "Food" where the score for each of these super groups is the minimum score of related sub-groups.

 

The Data Model

The data model to implement this design is relatively straight-forward. 

 

[SuperGroups] 1-* [Groups2SuperGroups] 1-1 [Groups] 1-* [Attributes2Groups] 1-* [AttributeScores]

 

We will use the following columns and data to build a sample data model via "Enter Data" queries:

 

AttributeScores

AttributeScores contains two columns, "Attribute" and "Score".

 

AttributeScore
Attribute 1100
Attribute 275
Attribute 380
Attribute 4100
Attribute 560
Attribute 630
Attribute 740
Attribute 835
Attribute 975
Attribute 1090

 

Groups

Groups contains a single column "Group"

 

Group
Group1
Group2
Group3
Group4

 

SuperGroups

SuperGroups contains a single column "SuperGroup":

 

SuperGroup
SuperGroup1
SuperGroup2

 

Attributes2Groups

Attributes2Groups contains 2 columns "Group" and "Attribute":

 

GroupAttribute
Group1Attribute1
Group1Attribute2
Group1Attribute3
Group2Attribute4
Group2Attribute5
Group3Attribute6
Group3Attribute7
Group3Attribute8
Group4Attribute9
Group4

Attribute10

 

Groups2SuperGroups

Groups2SuperGroups contains two columns, "Group" and "SuperGroup":

 

SuperGroupGroup
SuperGroup1Group1
SuperGroup1Group2
SuperGroup2Group3
SuperGroup2Group4

 

A Simple Solution

Create an AverageScore column in Groups:

 

AverageScore = CALCULATE(AVERAGE(AttributeScores[Score]),RELATEDTABLE(AttributeScores))

Groups data is now:

Groups

GroupAverageScore
Group185
Group280
Group335
Group482.5

 

Create a MinScore column in SuperGroups:

MinScore = CALCULATE(MIN(Groups[AverageScore]),RELATEDTABLE(Groups))

SuperGroups data is now:

SuperGroups

SuperGroupMinScore
SuperGroup180
SuperGroup235

 

 

Adding Complexity

What happens when we change our core AttributeScores table to include an additional field so that we can store multiple "things" in the same table:

 

 

AttributeScoreThing
Attribute1100Thing1
Attribute275Thing1
Attribute380Thing1
Attribute4100Thing1
Attribute560Thing1
Attribute630Thing1
Attribute740Thing1
Attribute835Thing1
Attribute975Thing1
Attribute1090Thing1
Attribute150Thing2
Attribute225Thing2
Attribute330Thing2
Attribute450Thing2
Attribute510Thing2
Attribute650Thing2
Attribute750Thing2
Attribute850Thing2
Attribute990Thing2
Attribute1090Thing2

 

Note that the first 10 entries have remained the same other than the addition of "Thing1" in the "Thing" column. Now we have a representation of the same survey or scoring conducted on multiple "things".

 

Houston, We Have a Problem

By simply adding this additional field, we might think that we have not changed the data model to such a degree that our simple solution will suffice. No such luck.

 

If we simply put our Group and AverageScore in a table visual and add a slicer for "Thing", we get the same numbers for each Thing.

 

GroupAverageScore
Group160
Group255
Group342.5
Group486.25

 

When we should get:

 

Thing1

GroupAverageScore
Group185
Group280
Group335
Group482.5

 

Thing2

GroupAverageScore
Group135
Group230
Group350
Group490

 

Similarly, if we add a simple table visualization of SuperGroups to show "SuperGroup" and "MinScore", regardless of slicer selection, these remain:

 

SuperGroupMinScore
SuperGroup155
SuperGroup242.5

 

If we try using row level security instead of slicers, the results are the same. So what is going on? Basically, calculated columns are essentially static in nature, calculated upon data refresh and are not updated contextually. Note that I realize that if you use the original "Score" value from AttributeScores and choose the dynamic "Average" aggregation that it works at the group level, but this only holds for very simplistic aggregations.

 

A Slightly More Complex Solution

In order to solve this problem, we switch from calculated columns to measures.

 

Back to the Groups table (or anywhere), we can create a measure defined such that:

AverageScoreMeasure = CALCULATE(AVERAGE(AttributeScores[Score]),RELATEDTABLE(AttributeScores))

This is EXACTLY the same DAX calculation as we had for our calculated column. However, the calculation for MinScore is dramatically different. For the MinScore as a measure, we are trying to find the minimum score of a measure and thus we must return the measure AverageScoreMeasure to our measure calculation within the correct context. Thus, to do this we define MinScoreMeasure such that:

 

MinScoreMeasure = MINX ( SUMMARIZE ( Groups, Groups[Group] , "AVG",[AverageScoreMeasure] ), [AVG])

What is going on here is that we are creating a summarized table using the SUMMARIZE function. We are returning two columns, the first is a grouping by our "Groups" column called "Groups" and the second is our AverageScoreMeasure calculation in a column called "AVG". We are then taking the MIN of column AVG.

 

Now, when we create table visualizations using our "Group" and "AverageScoreMeasure" in one table and our "SuperGroup" and "MinScoreMeasure" in anther table, we receive the correct results for "Thing1" and "Thing2" whether using slicers or RLS.

 

Thing1

Groups

GroupAverageScoreMeasure
Group185
Group280
Group335
Group482.5


SuperGroups

SuperGroupMinScoreMeasure
SuperGroup180
SuperGroup235

 

Thing2

Groups

GroupAverageScoreMeasure
Group135
Group230
Group350
Group490

 

SuperGroups

SuperGroupMinScoreMeasure
SuperGroup130
SuperGroup250

 

Conclusion

The designer pattern described here has many applications to just about any scenario in which scores or rankings need to be aggregated and then aggregated again. The solutions described within this article handle both simple scenarios as well as complex scenarios involving multiple base scoring sets of data. This scenario demonstrates the limitations of custom columns and the corresponding superiority of measures for handling aggregation roll-ups.

Comments

Great article and I'm happy to have made a small contribution to this pattern.

Can any one help me out on How to work on the Rest API in Power BI, More Guide lines in R Script as well as DAX Expression

Can any one help me out on How to work on the Rest API in Power BI, More Guide lines in R Script as well as DAX Expression

 

Thanks in Advance,

Pratima

Groups, sub groups and super groups....well thought out @Greg_Deckler

Anonymous

edit: I retract my earlier statement, I had a cardinality error in the relationships that I had missed even though I triple checked prior to posting. The original formula does work properly.

 

@Greg_Deckler I'm following along in Pwr BI and found a typo in at least on of your formulas. Above you have:

 

AverageScore = CALCULATE(AVERAGE(AttributeScores[Score]),RELATEDTABLE(AttributeScores))

When in fact the related table should be 'Attributes2Groups' no the actual Attribute table.

 

Cheers,

-R