Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
fchukoskie
Advocate I
Advocate I

Star Schema cross-join

I'm creating my first PowerBI dashboard, using DirectQuery against an Azure SQL Data Warehouse instance, and am seeing strange cross-joins.

 

When I add 1 fact table and 2 dimensions to the model and test counts in a simple matrix, creating a "Count Distinct" measure on one of my key fields (a join key between the Fact & Dimension table) works as expected (though slowly) if I pull that field onto the matrix from the Fact table. So, it should be asking SQL for a distinct count of "fact.userid".

 

However I pull that same field onto the matrix from the Dimension table (dimension.userid), the Count Distinct looks like it's cross-joining to the other dimensions in the matrix (in this case, "Month" from the calendar dimension). Subsequent tests then show that any field from that dimension also cross-joins in the matrix.

 

The relationships to each dimension show as "Active" under the "Manage Relationships" table, but I can't figure out if there's a way to see the SQL that PowerBI sends to Azure in its DirectQuery to understand why it's counting incorrectly.

 

Am I handling the relationships wrong, or does PowerBI expect the model to be more "flat" for DirectQuery to work?

1 ACCEPTED SOLUTION

i don't think bidirectional will work with direct query mode

View solution in original post

11 REPLIES 11
fchukoskie
Advocate I
Advocate I

It's not Azure. Here's a sample matrix created from a DirectQuery against AdventureWorksDW.

 

Count (Distinct) of FactInternetSales.CustomerKey:FromFactTable.png

 

Count (Distinct) of DimCustomer.CustomerKey:

FromDimTable.png

Here's the model:

Model.png

Perform your counts on the fact table, or make sure that every relationship in the chain between two tables has the relationship cross filter direction set to 'Bi-directional'

 

 

Using the key field as an example was contrived. I really want to attributes from the dimension table in a star schema. If we can't, we'll just have to start authoring views to flatten everything (at which point, we probably pick a different tool).

 

As to the filter, how do I enable that for tweaking so that we can pick "bi-directional"?

RelationshipFilter.png

i don't think bidirectional will work with direct query mode

Disappointing, but that's reality sometimes. Thanks.

 

(maybe someday...)

https://ideas.powerbi.com/forums/265200-power-bi/suggestions/11434623-allow-bi-directional-filtering...

Well, direct query mode has intrinsic technical challenges since the data do not live in the model.  I will gladly trade directional relationships for ability to enhance direct query mode based models with calculations.

What is your specific reporting need where you need to have matrices without measures based on your fact? There might be some convenient workarounds similar to my post above to cover those.

 

If it's all tabular reporting without hitting the fact table, then Power BI is not the tool for you. Power BI is specifically optimized for analytical reporting, and not as a fancy UI for a SQL query generator (though Power Query can function as such).

Fair question on requirements, but I'm really making this up as I go. We're still getting data cleansed & standardized in an Azure SQL Data Warehouse, so I'm trying to evaluate options on BI & Reporting tools in parallel as time allows...which means that there was no requirement for matrices of data. It was just what I was doing that day while exploring options for self-service BI that might mimic what we could get from ROLAP tools like MicroStrategy or Cognos.

@fchukoskie, understood. Hopefully my explanations helped you understand the behavior better. I would recommend playing with Power BI some more with this new understanding. The relationship semantics are a bit unfamiliar at first, but it turns out they encourage design patterns that end up making the reporting and analysis pieces very intuitive even for non-technical end users.

 

Ultimately, Power BI is a very flexible suite of tools that deserves attention, even if it's not the solution you choose.

@greggyb It made perfect sense, thanks. Long-term, it's looking like a good fit for the times when I want something like QlikView, but less so for the times when I need MicroStrategy/Cognos ROLAP like behavior.

greggyb
Resident Rockstar
Resident Rockstar

Ah, I did not realize that the cross-filter direction is not an option for direct query.

 

With single direction relationships, everything is a pseudo-left join from the many side of the relationship to the one side. There's no way to force this to behave as an inner join.

 

For display purposes, the query generated is a cross-join of all tables whose fields are included. Each row of the resulting table has the measure(s) in the visualization evaluated for the combination of values in that row. Rows which have a blank after evaluating the measure (either because the combination doesn't exist in the fact or a conditional returned blank) are then filtered out of the display.

This is modeled on the behavior of Excel pivot tables and SSAS Multidimensional.

 

To force the relationship to be evaluated between two tables you must evaluate a measure. In a star schema, that measure must be filtered by the fact table.

 

If you want to write a measure based on something in the dimension, well then that's not a dimension, strictly speaking. Nonetheless dimension row counts make a great denominator for lots of measures, so we'd usually like to do this.

 

I'm not sure about the subset of DAX allowed in direct query mode, but I know it's an area still getting attention.

 

The following DAX utilizes cross table filtering to force the dimension to be evaluated in the context of the fact table:

 

DimensionRowCount =
CALCULATE(
    COUNTROWS( '<dimension>' )
    ,'<fact>'
)

This is the same format as the N:N workaround in SSAS Tabular < 2016, which makes sense, because a star schema is essentially a bunch of N:N relationships between dimensions with the fact as a gigantic bridge table.

 

If you just want to use Power BI as a tabular reporting interface:

  1. You're massively under-utilizing the tool
  2. You'll need to include, at minimum, a COUNTROWS( '<fact>' ) in the matrix / table to force the dimensions to play nicely together with relationships to the fact table

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.