Re: Best design choice for multiple granularity

wilson_smyth · ‎09-24-2018

I have a data model which i need to report on at multiple granularities.

Below is an example query that will populate the fact table with ID's from each dimension.


select o1.id, o2.id, o3.id
from o1
join o1o2Bridge on o1.id = o1o2Bridge.id1
join o2 on o1o2Bridge.id2 = o2.id
join o2o3Bridge on o2.id = o2o3Bridge.id2
join o3 on o3.id = o2o3Bridge.id3

This works fine if all tables have data and associated bridging data. But if each row in O1 does not have an associated row in O3, the row wont appear in the fact table.

e.g. O1.ID = 2 wont be in fact table as it is not associated with any row from O2, same for O1.ID = 3

Assume i want to get counts of O2 rows linked to O1 rows, noting that not all O2 rows will be linked to an O3 row

Id like to hear if i am neglecting to consider any options.

My options as i see them are:

- use outer joins

pull all data, add a row to O3 called "no linked O3" & link any O2 rows to this dimension entry that dont have an associated Row3. similar to the practice of adding "unknown" to a dimension to handle nulls.

Im not a fan of this as if im doing basic counts, they will appear incorrect.
e.g. For O1.ID = 2, a count of O2 will show a value of 1, even though the value should be zero.
e.g. if i do a count of O3 under O2 objects, i will still get a value of 1 for Obj2.id = 3.
I know i can correct this in dax but it seems a lot of work.

-Multiple fact tables.

Have a fact table showing O1 & O2 & a fact table showing O1, O2 & O3. This way two granularies are accounted for, but im now reusing almost the same query twice, and running it twice. Model is now also larger and more complex

Im still leaning towards this.

Would appreciate any input on the best practices in this situation.

v-danhe-msft · ‎09-25-2018

Hi @wilson_smyth,

Based on my research, I suggest you to use the merge funciton in query editor:

Referenc:http://radacad.com/append-vs-merge-in-power-bi-and-power-query

Also you could add an index column in each table, then you could merge them by the index.

Regards,

Daniel He

Community Support Team _ Daniel He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

wilson_smyth · ‎09-26-2018

Thansk for the reply.

Merging/appending is not the issue. obtaining the data is easy via merge/append or just modifying my source queries. Im asking about design practices, specifically:

- when there are many shared dimensions, is it better to have seperate fact tables at the desired granularity so that i dont have to deal with nulls when a row does not exist at a higher granularity

OR

should i have 1 single fact table, and wherever there are NULLS due to data not available at a lower granularity, populate it with the "unknown" value that i populate into the dimensions & deal with it in dax code.