Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

CALCULATE() with Filter() on many-side table seems to ignore relelationship direction

Hi,

 

I've posted this question before but for some reason it was only applied to the Spanish forum and not the English one (I have checked), so here goes another try - and my apologies if 2 of the same questions should appear.

 

The original issue

I have come across a case where I could use some help to understand the behavior of the CALCULATE(..., FILTER()) function, where the FILTER() statement is filtering the table on the many-side of a one-to-many relationship and simply returning all available records (IDs in this case).

 

Consider the following case:

A very simple setup with 2 tables to track the progress of the project, with a table containing the budget hours for each project '(Project Hours Budget') and a table containing the actual reports of the time spent on the projects ('Project hours spent'). In these tables there is a one-way relationship that goes from 'Project Hours Budget' to 'Project Hours Spent'.

 

Udklip.PNG

In this configuration I would like to create a measure that calculates the sum of the budget hours for the projects that have started (where the actual time spent has been reported).

 

This (apparently) can be done using the following measure:

 

 

 

 

Budget Hours Measure =
    CALCULATE(
        SUM('Projects Hours Budget'[Budget Hours]),
        FILTER(
            'Project Hours Spent',
            'Project Hours Spent'[Project ID]
        )
    )

 

 

 

 

 

The question:
I wonder why this works in the first place, since there is no relationship that implies that 'Project Hours Spent' should filter 'Project Hours Budget', which is the table we are doing a SUM() on. The relationship indicates that the Budget table can filter the Hours Spent table, but not the other way around. Also, if I remove the relationship between the tables the result changes (to yield the total SUM as I would've guessed it would be in the first place), so there is no question that the relationship is being used implicitly in some way - I just don't know why / how, and I haven't been able to find any source that indicates this behaviour.

 

Basically, I assume the engine is using the existing relationship to identify related columns and then filter the Budget Hours table using the Project ID's from the Hours Spent table - and in doing so ignoring the existing relationship direction. I just hope someone could confirm this behaviour and hopefully link me to some documentation or alike.

 

Below is a picture of the results obtained by using the above measure and using a simple SUM(), as well as an overview of the content of each table:

Udklip2.PNG

EDIT: based on my comment further down I realised that including the queries from DAX Studio would be ideal, so here they come:

 

1st query:

DEFINE TABLE '$TTable2' := 
SELECT
	SIMPLEINDEXN ( 'Projects Hours Budget'[Project ID] )
FROM 'Project Hours Spent'
	LEFT OUTER JOIN 'Projects Hours Budget' ON 'Project Hours Spent'[Project ID]='Projects Hours Budget'[Project ID]
WHERE
	 ( PFCASTCOALESCE ( 'Project Hours Spent'[Project ID] AS  INT ) <> 0 ) ;

 

2nd query:

DEFINE TABLE '$TTable1' := 
SELECT
	SUM ( 'Projects Hours Budget'[Budget Hours] )
FROM 
	'Projects Hours Budget'
WHERE
	'Projects Hours Budget'[Project ID] ININDEX '$TTable2'[$SemijoinProjection];

 

 

 

10 REPLIES 10
v-rzhou-msft
Community Support
Community Support

Hi @Anonymous 

Here I build a New Query Named Query3 comparing with other two Querys you build before.

 

Query 1 = 
    CALCULATE(
        SUM('Project Hours Budget'[Budget Hours]),
        FILTER(
            'Project Hours Spent',
            'Project Hours Spent'[Project ID]
        )
    )
Query 2 = 
SUM('Project Hours Budget'[Budget Hours])
Query 3 = 
    CALCULATE(
        SUM('Project Hours Budget' [Budget Hours]),
        FILTER(
            'Project Hours Budget',
            'Project Hours Budget'[Project ID] in VALUES('Project Hours Spent'[Project ID])
        )
    )

 

Query 3 will return to the Sum of Budget Hours in the Filtered Project Hours Budget Table. (ProjectID = 1,2,5,7 All IDs are in Project ID in 'Project Hours Spent' Table )

The Filtered Project Hours Budget Table returns as below.

1.png

This is same as Query1 (Build relationships)

In Query 1 if you build relationship, the query will work as Query3, get all Project IDS in 'Project Hours Spent' Table,

and use these IDs to filter Project Hours Budget Table. The Filtered Project Hours Budget Table returns is the same as above.

So the result is 10+15+70+35 = 130. 

If we remove the relationship, Query1 will work like Query2. Due to two tables are not related, so filter Project ID in 'Project Hours Spent' Table will not impact  Project Hours Budget Table.

So the Filtered Project Hours Budget Table returns itself:

2.png

And Query1 = 10+15+8+9+70+40+35 =187.

Result is as below.

Build relationship:

3.png

Remove relationship:

4.png

You can download the pbix file from this link: CALCULATE() with Filter() on many-side table seems to ignore relelationship direction

 

Best Regards,

Rico Zhou

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly. 

Anonymous
Not applicable

Hi @v-rzhou-msft 

 

Thank you for the detailed answer - however, I'm still missing an explanation as to why the engine uses the relationship in the first place, given that the direction of the relationship is from [Projects Budget Hours] to [Project Hours Spent]. This naturally implying that [Projects Budget Hours] should be able to filter [Project Hours Spent]  - but not the other way around?

 

Considering the direction of the relationship I would expect the Query 1 result to always be the same as Query 2?

Hi @Anonymous 

Could you tell me if your problem has been solved? If it is, kindly Accept it as the solution. More people will benefit from it. Or you are still confused about it, please provide me with more details about your problem or share me with your pbix file from your Onedrive for Business.

 

Best Regards,

Rico Zhou

Anonymous
Not applicable

Hi @v-rzhou-msft 

 

Thank you for your answer - I've only just had the time to read the articles as well, sorry. I believe I've wrapped my head around the logic going on based on your answer and the article regarding expanded tables.

 

To recap my conclusion and hopefully to have it verified/disproved:

 

As i first thought, the direction of the relationship is in fact disregarded in this calculation, as the direction is going from [Budget] to [Spent] - and I'm trying to calculate the [Budget Hours] with a filter based on 'Project Hours Spent'[Project ID]. However, as the calculation is done using DAX, an expanded table is implicity created based on the existing relationships - and this expanded table doesn't consider the direction of the relationship. 

 

This expanded table is equal to the one you've posted an image of, and the filtering is in fact done on this expanded table - this implies that the expanded table is limited to the rows where 'Projects Hours Budget'[Project ID] = 'Project Hours Spent'[Project ID] (or is it simply only the rows where there is a value in 'Projects Hours Budget'[Project ID]?)

Hi @Anonymous 

If you build relationships between Project Hours Budget and Project Hours Spent by Porject ID.

The direction of One to many is always from one the many. 

And in engine there is an expand table as below.

 1.png

So the Query 1 returns to 130, when you build relationships.

Remove relationships there is Table A only.

2.png

So the filter in Table B makes no sense, Query return to 180.

For more infor about relationships and expand table, you may refer to these blogs:

expanded-tables-in-dax 

What is the Direction of Relationship in Power BI?

 

Best Regards,

Rico Zhou

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly. 

amitchandak
Super User
Super User

@Anonymous , As this is, looking for true and false. It might restrict to peoject ID, present in hours table. So I think you are getting budget for project available in hours

 

 

Please provide your feedback comments and advice for new videos
Tutorial Series Dax Vs SQL Direct Query PBI Tips
Appreciate your Kudos.

Anonymous
Not applicable

Hi @amitchandak and thank you for the swift reply.

But what I'm wondering is why the relationship is being used in the first place, as Power BI is obviously filtering the 'Budget Hours' table based on the 'Hours Spent' table returned from the FILTER() statement - and in the same time seemingly disregards the fact that the relationship between those tables are a "one to many" going from the 'Budget Hours' table to the 'Hours Spent' table.

 

Why is this relationship used in the first place?

@Anonymous , Power bi default is a right join. So obe use anything from the fact it will be considered

Anonymous
Not applicable

@amitchandak could you elaborate on how that affects the relationship in this case?

Anonymous
Not applicable

I'm feeling a little foolish that I didn't do this prior to creating the question, but below is the queries for the calculation, fetched from DAX Studio (please note that this doesn't answer my primary question though):


1st query:

DEFINE TABLE '$TTable2' := 
SELECT
	SIMPLEINDEXN ( 'Projects Hours Budget'[Project ID] )
FROM 'Project Hours Spent'
	LEFT OUTER JOIN 'Projects Hours Budget' ON 'Project Hours Spent'[Project ID]='Projects Hours Budget'[Project ID]
WHERE
	 ( PFCASTCOALESCE ( 'Project Hours Spent'[Project ID] AS  INT ) <> 0 ) ;

 

2nd query:

DEFINE TABLE '$TTable1' := 
SELECT
	SUM ( 'Projects Hours Budget'[Budget Hours] )
FROM 
	'Projects Hours Budget'
WHERE
	'Projects Hours Budget'[Project ID] ININDEX '$TTable2'[$SemijoinProjection];

 

So, what it does is, that it creates a temp table with the values in 'Projects Hours Budget'[Project ID] that also exists in 'Project Hours Spent'[Project ID]'  and then uses this temp table to filter the 'Projects Hours Budget' table and afterwards finding the sum of all the [Budget Hours]

 

However, this still leaves my question open, as to why Power BI chooses to do this filter / join in the first place?

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.