Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
mim
Advocate V
Advocate V

left outer join using dax, Multiple to Multiple

I have two tables in my data model, currently,  i am exporting them to Excel do the merge there using PQ and import back to PowerBI Data model, as you would imagine, this is not efficient.

 

i have read about the new " naturalleftoutjoin ", i can't make it to works, if I add the common id, it complains that there is a duplicate columns, if i remove it, it say there is no common column, clearly i don't understand how it works.

so here is my scenario; notice the id is duplicated in both tables and it is by design, so any idea how to create the calculated table Result using DAX ?

 

left join outer.PNG

3 ACCEPTED SOLUTIONS

Just playing around with CROSSJOIN and UNION as well and maybe a GENERATE using the nonexisting and ROW to provide space for the BLANK value 🙂 but now I have to take some sleep



Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

View solution in original post

OwenAuger
Super User
Super User

Using DAX you can do something like this, but M is probably preferable.

 

Result =
GENERATEALL (
    'Table 1',
    VAR Table1ID = 'Table 1'[id]
    RETURN
        SELECTCOLUMNS (
            CALCULATETABLE ( 'Table 2', 'Table 2'[id] = Table1ID ),
            "price", 'Table 2'[price]
        )
)

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

View solution in original post

@mim

 

Something like this should work. I tested it in a dummy model with physical tables Transformed_TARTostr and 'Table 2' and it worked for me.

 

Note that I've left you VAR Table1... unchanged except that I've been pedantic and qualified all your column names from Transformed_TAR and Tostr with their table names, e.g. [tag] becomes Transformed_Tar[tag] :

 

Result =
VAR Table1 =
    UNION (
        SELECTCOLUMNS (
            FILTER (
                Transformed_TAR,
                Transformed_TAR[current] = "yes"
                    && Transformed_TAR[rem_qty] <> 0
                    && Transformed_TAR[project phase] = "cons"
                    && Transformed_TAR[P6 ACTIVITY ID] <> BLANK ()
            ),
            "tag", Transformed_TAR[tag],
            "id", Transformed_TAR[P6 ACTIVITY ID],
            "subscan", Transformed_TAR[subscan],
            "subsystem", Transformed_TAR[TOSTR_Subsystem],
            "area", Transformed_TAR[Transformed Area],
            "weight", Transformed_TAR[weight],
            "drawing", Transformed_TAR[drawing],
            "phase", Transformed_TAR[project phase],
            "CREW", Transformed_TAR[crew],
            "Module", Transformed_TAR[Module],
            "rem_qty", Transformed_TAR[rem_qty]
        ),
        SELECTCOLUMNS (
            FILTER ( Tostr, Tostr[remaining_ITR] <> 0 && Tostr[P6 activity id] <> BLANK () ),
            "tag", Tostr[tag],
            "id", Tostr[P6 activity id],
            "subscan", Tostr[subscan],
            "subsystem", Tostr[SUBSYSTEM],
            "area", Tostr[Area],
            "weight", Tostr[weight],
            "drawing", Tostr[SHEET],
            "phase", Tostr[Phase],
            "CREW", Tostr[crew],
            "Module", Tostr[Module],
            "rem_qty", Tostr[remaining_ITR]
        )
    )
RETURN
    GENERATEALL (
        Table1,
        VAR Table1ID = [id]
        RETURN
            SELECTCOLUMNS (
                CALCULATETABLE ( 'Table 2', 'Table 2'[id] = Table1ID ),
                "price", 'Table 2'[price]
            )
    )

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

View solution in original post

24 REPLIES 24
OwenAuger
Super User
Super User

Using DAX you can do something like this, but M is probably preferable.

 

Result =
GENERATEALL (
    'Table 1',
    VAR Table1ID = 'Table 1'[id]
    RETURN
        SELECTCOLUMNS (
            CALCULATETABLE ( 'Table 2', 'Table 2'[id] = Table1ID ),
            "price", 'Table 2'[price]
        )
)

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn
Anonymous
Not applicable

Thx @OwenAuger, that helped me a lot !

@OwenAuger intially I used crossjoin with filter, and it was a little slow, around 1 minutes, than i tried your approach !!!!! less than a second.

 

when you have time, do you mind explaining your appraoch.

 

thanks again.

Hi @mim

 

Quoting from The Definitive Guide to DAX:

"you might consider the GENERATE function similar to the CROSS APPLY condition in SQL, whereas GENERATEALL is similar to OUTER APPLY in SQL".

 

GENERATE/GENERATEALL iterate over the rows of first table specified and determine the second table in the context of each row, with freedom as to how the condition relates to that row context.

 

Performance-wise, I would imagine they perform better than a FILTER(CROSSJOIN()) pattern when the join results in a small subset of the rows of the CROSSJOIN of the two tables.

 

I haven't really played with NATURALLEFTOUTERJOIN or NATURALINNERJOIN, but I note they require join columns to have the same lineage.

 

Owen 🙂


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

@OwenAuger thanks for the explanation, i have been using your solution for the last two weeks and it is working perfectly.

 

currently table 1 is a calculated table,  I tried instead of making Table 1 as a variable using the same expression to built Table 1, but could not  make it to works, do you think it is possible or generateall required Table 1 to be physical ?

@mim

GENERATEALL should work even if Table 1 is not physical.

The expression we had didn't rely on the lineage of any columns in Table 1, so it should be possible to use it with Table 1 as a variable.

 

The only difference is when referring to a column of Table 1 (as a variable) in the row context of Table 1, you don't need to qualify the column name with a table name. So rather than 'Table 1'[id] you would just use [id].

 

Can you post the new expression you are trying to write, with detail on any phyiscal tables that Table 1 depends on?


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn
Anonymous
Not applicable

Hi @OwenAuger ,

Can Table2 be Variable like Table 1. I have defined Table 1 and table 2 are defined in the same DAX window but Generateall is not working but it is working If one of the table is a physical table.

Could you please provide a sample with being Table 1 an dTable are declared as variables.

@OwenAuger thanks , Table 1 is is based on this expression

 

var Table1= union( SELECTCOLUMNS(filter(Transformed_TAR,[current]="yes"&&[rem_qty]<>0&&[project phase]="cons"&&[P6 ACTIVITY ID]<>BLANK()),"tag",[tag],"id",[P6 ACTIVITY ID],"subscan",[subscan],"subsystem",[TOSTR_Subsystem],"area",[Transformed Area],"weight",[weight],"drawing",[drawing],"phase",[project phase],"CREW",[crew],"Module",[Module],"rem_qty",[rem_qty]),SELECTCOLUMNS(filter(Tostr,[remaining_ITR]<>0&&[P6 activity id]<>BLANK()),"tag",[tag],"id",[P6 activity id],"subscan",[subscan],"subsystem",[SUBSYSTEM],"area",[Area],"weight",[weight],"drawing",[SHEET],"phase",[Phase],"CREW",[crew],"Module",[Module],"rem_qty",[remaining_ITR]))

 

the id = "id"

 

cheers

Mim

@mim

 

Something like this should work. I tested it in a dummy model with physical tables Transformed_TARTostr and 'Table 2' and it worked for me.

 

Note that I've left you VAR Table1... unchanged except that I've been pedantic and qualified all your column names from Transformed_TAR and Tostr with their table names, e.g. [tag] becomes Transformed_Tar[tag] :

 

Result =
VAR Table1 =
    UNION (
        SELECTCOLUMNS (
            FILTER (
                Transformed_TAR,
                Transformed_TAR[current] = "yes"
                    && Transformed_TAR[rem_qty] <> 0
                    && Transformed_TAR[project phase] = "cons"
                    && Transformed_TAR[P6 ACTIVITY ID] <> BLANK ()
            ),
            "tag", Transformed_TAR[tag],
            "id", Transformed_TAR[P6 ACTIVITY ID],
            "subscan", Transformed_TAR[subscan],
            "subsystem", Transformed_TAR[TOSTR_Subsystem],
            "area", Transformed_TAR[Transformed Area],
            "weight", Transformed_TAR[weight],
            "drawing", Transformed_TAR[drawing],
            "phase", Transformed_TAR[project phase],
            "CREW", Transformed_TAR[crew],
            "Module", Transformed_TAR[Module],
            "rem_qty", Transformed_TAR[rem_qty]
        ),
        SELECTCOLUMNS (
            FILTER ( Tostr, Tostr[remaining_ITR] <> 0 && Tostr[P6 activity id] <> BLANK () ),
            "tag", Tostr[tag],
            "id", Tostr[P6 activity id],
            "subscan", Tostr[subscan],
            "subsystem", Tostr[SUBSYSTEM],
            "area", Tostr[Area],
            "weight", Tostr[weight],
            "drawing", Tostr[SHEET],
            "phase", Tostr[Phase],
            "CREW", Tostr[crew],
            "Module", Tostr[Module],
            "rem_qty", Tostr[remaining_ITR]
        )
    )
RETURN
    GENERATEALL (
        Table1,
        VAR Table1ID = [id]
        RETURN
            SELECTCOLUMNS (
                CALCULATETABLE ( 'Table 2', 'Table 2'[id] = Table1ID ),
                "price", 'Table 2'[price]
            )
    )

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn
Anonymous
Not applicable

@OwenAuger i have similar needs and Table 2 is virtual table too, need to calculate as VAR Table2 = ..... But it's not working in GENERATEALL. Any advise?

Thanks 

@Anonymous

Could you post more detail on your particular tables & expected result?


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn
Anonymous
Not applicable

I'm looking for solution to left out join 2 virtual tables according to filter context and thus find this thread with your excellent DAX. Instead of giving all new data, we can reuse Mim's case, can you replace Table 2 to virtual table? e.g. VAR table_2 = SELECTCOLUMNS('Table 2', "id", 'Table 2'[id], "price", 'Table 2'[price]). Then use it in GenerateAll. 

 

BTW, as you mentioned, i tried NATURALLEFTOUTERJOIN() but got data type/data linage issue. 

Thanks.

@Anonymous

If I understand you correctly, you should be able to do something like this, using FILTER instead of CALCULATETABLE.

Result =
VAR table_2 =
    SELECTCOLUMNS ( 'Table 2', "id", 'Table 2'[id], "price", 'Table 2'[price] )
RETURN
    GENERATEALL (
        'Table 1',
        VAR Table1ID = 'Table 1'[id]
        RETURN
            SELECTCOLUMNS ( FILTER ( table_2, [id] = Table1ID ), "price", [price] )
    )

I have assumed we start with the same physical tables as Mim's example and make table_2 a virtual table as you've described. Table 1 could also be a virtual table and it would make little difference.


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn
Anonymous
Not applicable

Hi

I need to replace null\blank values from generateall ( null values from the Right Table ( Table2)) with 0 ( Zero).

 

Column1 ( from right Table) is whole number.

I have tried 

Column1+ 0

IF(isblank(Table2[Column1],0, Table2[Column1])

But it is still showing  Empty string\Blank Values.

 

Also, no calculations working on this Blank\Empty string. like Addition, subtraction,comparision on Blank Values. Calculation working if it has some values.

Could you please suggest solution?

Anonymous
Not applicable

Hi @OwenAuger 

I need to replace null\blank values from generateall ( null values from the Right Table ( Table2)) with 0 ( Zero).

 

Column1 ( from right Table) is whole number.

I have tried 

Column1+ 0

IF(isblank(Table2[Column1],0, Table2[Column1])

But it is still showing  Empty string\Blank Values.

 

Also, no calculations working on this Blank\Empty string. like Addition, subtraction,comparision on Blank Values. Calculation working if it has some values.

Could you please suggest solution?

Anonymous
Not applicable

This has been solved by keeping the results into Selectedcolumns or storing in physical table( This Null values has been modified as blank() values)

Hi @OwenAuger

 

I have a similar issue for which I'm looking for a solution. It's actually a merge query but as I'm not allowed to use the 'merge' functionality from power query (scheduled refresh on power bi server does not support that) I have to use DAX syntax to mimic this.

 

Here is the situation:

 

- merge Table A, Table B and Table C into new table_merged

- I need all records from Table A

- Add matching records from Table B as new columns where tableA.version= tableB.version and tableA.version2=tableB.version2

- if there is no matching record from Table B:

  • first match Table C where tableA.version= tableC.version and tableA.version2=tableC.version2
  • match Table B where tableB.version= tableC.version3 and tableA.version2=tableC.version4

 

Hope the picture below makes it more clear 🙂

 

thanks in advance!

Muki

 

merge.png

Hi @Muki

It would certainly be much easier to do these merges in Power Query! 🙂

But if you have to do this in DAX, here is some sample code (which I'm sure can be improved upon by the way!)

I've assumed the tables are named TableA, TableB & TableC.

 

The last DISTINCT step may not be necessary...not sure. It doesn't matter for your sample data, but it could matter in cases where Table C is not required but merging with it in step MERGE_ABCB results in some un-needed duplicate rows.

 

MergedTable =
VAR Merge_AB =
    GENERATEALL (
        TableA,
        FILTER (
            TableB,
            TableB[tableB.version] = TableA[tableA.version]
                && TableB[tableB.version2] = TableA[tableA.version2]
        )
    )
VAR MERGE_CB =
    GENERATEALL (
        TableC,
        FILTER (
            TableB,
            TableB[tableB.version] = TableC[tableC.version3]
                && TableB[tableB.version2] = TableC[tableC.version4]
        )
    )
VAR MERGE_ABCB =
    GENERATEALL (
        Merge_AB,
        SELECTCOLUMNS (
            FILTER (
                MERGE_CB,
                TableC[tableC.version] = TableA[tableA.version]
                    && TableC[tableC.version2] = TableA[tableA.version2]
            ),
            "tableC.ID", TableC[tableC.ID],
            "tableC.startdate", tableC[tableC.startdate],
            "tableC.enddate", tableC[tableC.enddate],
            "tableC.version", tableC[tableC.version],
            "tableC.version2", tableC[tableC.version2],
            "tableC.version3", tableC[tableC.version3],
            "tableC.version4", tableC[tableC.version4],
            "tableB.ID_copy", tableB[tableB.ID],
            "tableB.startdate_copy", tableB[tableB.startdate],
            "tableB.enddate_copy", tableB[tableB.enddate],
            "tableB.version_copy", tableB[tableB.version],
            "tableB.version2_copy", tableB[tableB.version2]
        )
    )
VAR MERGE_CLEAN =
    SELECTCOLUMNS (
        MERGE_ABCB,
        "tableA.ID", tableA[tableA.ID],
        "tableA.startdate", tableA[tableA.startdate],
        "tableA.enddate", tableA[tableA.enddate],
        "tableA.version", tableA[tableA.version],
        "tableA.version2", tableA[tableA.version2],
        "tableB.ID", IF ( ISBLANK ( tableB[tableB.ID] ), [tableB.ID_copy], tableB[tableB.ID] ),
        "tableB.startdate", IF (
            ISBLANK ( tableB[tableB.startdate] ),
            [tableB.startdate_copy],
            tableB[tableB.startdate]
        ),
        "tableB.enddate", IF (
            ISBLANK ( tableB[tableB.enddate] ),
            [tableB.enddate_copy],
            tableB[tableB.enddate]
        ),
        "tableB.version", IF (
            ISBLANK ( tableB[tableB.version] ),
            [tableB.version_copy],
            tableB[tableB.version]
        ),
        "tableB.version2", IF (
            ISBLANK ( tableB[tableB.version2] ),
            [tableB.version2_copy],
            tableB[tableB.version2]
        ),
        "tableC.ID", [tableC.ID],
        "tableC.startdate", [tableC.startdate],
        "tableC.enddate", [tableC.enddate],
        "tableC.version", [tableC.version],
        "tableC.version2", [tableC.version2],
        "tableC.version3", [tableC.version3],
        "tableC.version4", [tableC.version4]
    )
VAR MERGE_CLEAN_DISTINCT =
    DISTINCT ( MERGE_CLEAN )
RETURN
    MERGE_CLEAN_DISTINCT  

 


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

This is amazing, big thanks! @OwenAuger

TomMartens
Super User
Super User

Hey,

 

this is odd, I created the screenshot from this

 

little example

 

recreating your sample data, a table merge using left outer creates this result

LeftOuter.png

 

Really wondering whats going on



Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.