cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
Super User III
Super User III

Doubts with INNERJOINs and lineage

Hi all,

 

I am reading this article on joining tables in DAX and I am quite confused as to how these INNERJOINs work. In the section called "Using NATURALLEFTOUTERJOIN and NATURALINNERJOIN without Relationships" the authors claim:

 

You cannot join these two tables by using ProductKey, because such a these columns havehas the same name but different data lineages in the model. In fact, the following code generates an error:

EVALUATE
NATURALLEFTOUTERJOIN( P_A, P_B )

 

Ok, so from that I understand that the columns need a common lineage. But then a few lines later they state:

 

In order to join two columns with the same name and no relationships, it is necessary that these columns do not have a data lineage. To obtain that, it is necessary to write the column using an expression that breaks the data lineage, as in the following example.

 

So with this it seems we have to break the lineage (??). Later, again, the example with the TREATAS seems to be used to create rather than break the lineage. This is where I'm getting lost. Is it break or create lineage that you need for the INNERJOINS to work?

If anyone knows about this I would really appreciate it if you can clarify it for me, or direct me to other resources where I can read about it. I've checked it in another book but it doesn't really go into the details.


Many thanks for your help.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted
Super User III
Super User III

Re: Doubts with INNERJOINs and lineage

@AlB

Man... you ask very difficult questionsSmiley Wink

 

My limited understanding is following

 

1) InnerJoin can be created between tables that do not physically exist in the data model. In that case the tables should have a column with same name and same type. For example you can create temporary tables using variables using DATATABLE, Row or other such functions

2) When using INNERJOIN with Tables existing in the Data Model, there should already be a relationship beween these tables.


If there there is no relationship , we can still use INNERJOIN
                                but first
we need to remove the data lineage (i.e. the tag which identifies the original column in the data model that the values of a column originated from). Otherwise it can create confusion querying physical tables in the data model.


This tag is not removed simply by changing the names of the column using SELECTCOLUMNS or any other simialr function. becasue DAX engine is smart enough to remember where the values originated from. Data lineage is maintained as long as an expression is only made up of one column reference. To beak that lineage we use techniques like adding empty string or a 0 to the column expression.

Try my new Power BI game Cross the River

View solution in original post

9 REPLIES 9
Highlighted
Resident Rockstar
Resident Rockstar

Re: Doubts with INNERJOINs and lineage

Hi @AlB,

 

Hope the blogs below will help you.

 

https://curbal.com/blog/glossary/naturalleftouterjoin-dax

https://curbal.com/blog/glossary/naturalleftouterjoin-dax

 

In addition, we could use DAX Studio trace the dax query to analyze the query better.

 

Best Regards,

Cherry

Community Support Team _ Cherry Gao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Highlighted
Super User III
Super User III

Re: Doubts with INNERJOINs and lineage

Hi @v-piga-msft

 

Thanks for your reply. I checked the videos but, unfortunately,  they don't answer my question. In fact the speaker says that she doesn't know how the joins work with no relationships and she is asking for tips herself.

Maybe @Zubair_Muhammad can give us a hand here?

Many thanks

 

 

Highlighted
Super User III
Super User III

Re: Doubts with INNERJOINs and lineage

@AlB

Man... you ask very difficult questionsSmiley Wink

 

My limited understanding is following

 

1) InnerJoin can be created between tables that do not physically exist in the data model. In that case the tables should have a column with same name and same type. For example you can create temporary tables using variables using DATATABLE, Row or other such functions

2) When using INNERJOIN with Tables existing in the Data Model, there should already be a relationship beween these tables.


If there there is no relationship , we can still use INNERJOIN
                                but first
we need to remove the data lineage (i.e. the tag which identifies the original column in the data model that the values of a column originated from). Otherwise it can create confusion querying physical tables in the data model.


This tag is not removed simply by changing the names of the column using SELECTCOLUMNS or any other simialr function. becasue DAX engine is smart enough to remember where the values originated from. Data lineage is maintained as long as an expression is only made up of one column reference. To beak that lineage we use techniques like adding empty string or a 0 to the column expression.

Try my new Power BI game Cross the River

View solution in original post

Highlighted
Super User III
Super User III

Re: Doubts with INNERJOINs and lineage

Hi  @Zubair_Muhammad

Thanks very much for your reply. I guess it's a good sign that the questions are difficult Smiley Wink

 

Your tips were very useful, particularly the difference between physical tables and those created dynamically. After testing a few things my conclusion is that:

 

1. Tables created dynamically from scratch, such as with DATATABLE or the table constructor:  { ("a", 1) , ("b", 2) }, do not have that lineage tag.

2. Columns in physical tables always have a lineage tag. 

 

In the first case  there's no lineage so there's no problem for the joins ; it seems like "no lineage" on both columns is considered "same lineage" by the engine. In the second, there's lineage and it's different. It needs to be removed so that we have two columns with no lineage that can now be joined.

What do you think?

 

Thanks a lot

 

Highlighted
Super User III
Super User III

Re: Doubts with INNERJOINs and lineage

@Zubair_Muhammad

 

And since you like tough questions Smiley Happy I have something else regarding the same article on SQLBI.

(Disclaimer: I have posted this question there as well, in case you prefer to wait for their answer. That might be soon or take quite long, depending on how busy they are). 

 

They show an interesting way of creating common lineage through TREATAS before joining the tables:

 

EVALUATE
VAR B_TreatAs =
TREATAS ( P_A, P_B[ProductKey], P_A[Code], P_A[Color] )
VAR Result =
NATURALLEFTOUTERJOIN ( B_TreatAs, P_B )
RETURN
Result

 

I have two variations of this code and corresponding questions:

 

1. The following does not work:

 

EVALUATE
VAR Result =
CALCULATETABLE(NATURALLEFTOUTERJOIN ( P_A, P_B ), TREATAS ( P_A, P_B[ProductKey], P_A[Code], P_A[Color] ))
RETURN
Result

 

Here we perform the TREATAS in an argument of the CALCULATETABLE. We are providing the CALCULATETABLE with a table (result of the TREATAS) with the adequate lineage changes. Any idea why this wouldn't work?

 

2. This does not work either

 

EVALUATE
VAR B_TreatAs =
TREATAS ( P_A, P_B[ProductKey], P_A[Code], P_A[Color] )
VAR Result =
NATURALLEFTOUTERJOIN ( P_A, P_B )
RETURN
Result

 

The only difference with the initial code we are using P_A as the first argument of the INNERJOIN instead of the variable B_TreatAs. Here is my reasoning as to why this should work:
TREATAS properly modifies the lineage when the variable B_treatAs is created. So P_B[ProductKey] has already "received" the lineage of P_A[ProductKey]. Once that has happened, the Innerjoin should work with both P_A or B_TreatAs.

What is wrong with my reasoning? Or is it that the variable B_TreatAs is actually not created if the variable is not used/consumed?

 

Many thanks

Highlighted
Super User III
Super User III

Re: Doubts with INNERJOINs and lineage

Hi AIB

Will get back to you.
Will have to look into this topic.
Try my new Power BI game Cross the River
Highlighted
Advocate II
Advocate II

Re: Doubts with INNERJOINs and lineage

Highlighted
Super User I
Super User I

Re: Doubts with INNERJOINs and lineage

@AlB  thanks for this great question !!!

@Zubair_Muhammad  thanks for the amazing explanation !!!

Highlighted
Post Prodigy
Post Prodigy

Re: Doubts with INNERJOINs and lineage

@Zubair_Muhammad 

Just wanted to say thanks. Adding a text value to the column values solved my issue. 

Without the "C", the below didn't worked :). 

 

NATURALINNERJOIN(
   SELECTCOLUMNS(
      filter (
         'Model N Query'
         , 'Model N Query'[Date check product] = "Current"
            && 'Model N Query'[Date check header] = "Current"
            && 'Model N Query'[Doc Status Desc] = "Implemented"
            && 'Model N Query'[Sub Type Desc] = "Adjusted Cost"
         )
      , "InnerJoin" , "C" & 'Model N Query'[Contract Numb]
      , "Product" , 'Model N Query'[PROD_NUM]
      , "Currency" , 'Model N Query'[Tier1 Curr]
      )
   , SELECTCOLUMNS(
      filter (
         'Inclusion query'
         , 'Inclusion query'[Date check inclusion] = "Current"
         )
      , "InnerJoin" , "C" & 'Inclusion query'[Contract ID]
      , "Customer" , 'Inclusion query'[Customer ID]
      , "Wholesaler" , 'Inclusion query'[Wholesaler Name]
      )
   )

Helpful resources

Announcements
Meet the 2020 Season 2 Power BI Super Users!

Meet the 2020 Season 2 Power BI Super Users!

Find out who's part of the program this season, and welcome the new Super Users.

August 2020 Community Challenge: Can You Solve These?

August 2020 Community Challenge: Can You Solve These?

We're excited to announce our first cross-community 'Can You Solve These?' challenge!

July 2020 Community Highlights

July 2020 Community Highlights

Learn about the exciting things that happened in July.

Featured Data Story of The Month

Featured Data Story of The Month

All Data Stories Gallery contributions are reviewed for each month. We select a contribution and feature the community member the following month.

Power BI Dev Camp - Developing with .NET Core

Power BI Dev Camp - Developing with .NET Core

Learn how to develop custom web applications for Power BI using .NET Core 3.1 and .NET 5.

Top Solution Authors
Top Kudoed Authors