Solved: Re: Power Query Join tables

sdjensen · ‎07-01-2016

Hi,

I need some assistance with joining 2 tables in Query Editor. What I want is for each row in table 2 to return rows from table 1 where Table1.”Finanskonto Nummer” >= Table2.”Totaling.1” and Table1.”Finanskonto Nummer” <= Table2.”Totaling.2”

Table1:

Table2:

Wanted Result:

/sdjensen

sdjensen · ‎07-02-2016

Okay I got a working solution now... thanks to @ImkeF and @Greg_Deckler for all their time and help.

@ImkeF - it's strange that your trick doesn't work for you it worked for me? I would really like to accept your last code as a solution, but I haven't tested it, but I think our solutions are very close to each other.

this is my final code:

let
    // 1 til 1 relation for konti af type "konto"
    Source1 = Table.SelectColumns( Table.SelectRows( Finanskonto, each [FinanskontoTypeSort] = 0 ), {"FinanskontoKey"} ),
    Source1AddFinansKontoKeyIncluded = Table.AddColumn(Source1, "Included.FinanskontoKey", each [FinanskontoKey]), 
    
    // M til M relation for konti af type "sum" og "til-sum"
    Source2 = Table.SelectColumns( 
                Table.SelectRows( Finanskonto, each List.Contains({2, 4}, [FinanskontoTypeSort]) ), 
                {"FinanskontoKey", "Regnskab", "Finanskonto Nummer", "Totaling"}  
            ),
    Source2SplitByDelimiter = Table.SplitColumn(
                Source2,
                "Totaling",
                Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv)
            ),
    Source2UnpivotedColumns = Table.UnpivotOtherColumns(
                Source2SplitByDelimiter, 
                {"FinanskontoKey", "Regnskab", "Finanskonto Nummer"}, "Attribute", "Totaling"
            ),
    Source2SplitByDelimiter2  = Table.SplitColumn(
                Source2UnpivotedColumns,
                "Totaling",
                Splitter.SplitTextByDelimiter("..", QuoteStyle.Csv),
                {"Totaling.1", "Totaling.2"}
            ),
    Source2ReplaceNulls = Table.ReplaceValue(Source2SplitByDelimiter2, null, each _[Totaling.1], Replacer.ReplaceValue,{"Totaling.2"}
            ),
    Source2AddedTotaling1Idx = Table.Join(
                Source2ReplaceNulls,
                {"Regnskab", "Totaling.1"},
                Table.PrefixColumns(Table.SelectColumns( Finanskonto, {"Regnskab", "Finanskonto Nummer", "Index"} ), "Idx1"),
                {"Idx1.Regnskab", "Idx1.Finanskonto Nummer"}     
            ),
    Source2AddedTotaling2Idx = Table.Join(
                Source2AddedTotaling1Idx,
                {"Regnskab", "Totaling.2"},
                Table.PrefixColumns(Table.SelectColumns( Finanskonto, {"Regnskab", "Finanskonto Nummer", "Index"} ), "Idx2"),
                {"Idx2.Regnskab", "Idx2.Finanskonto Nummer"}     
            ),
    Source2TotalingList = Table.AddColumn ( Source2AddedTotaling2Idx, "TotalingList", each {[Idx1.Index]..[Idx2.Index]} ),
    Source2ExpandedTotalList = Table.ExpandListColumn(Source2TotalingList, "TotalingList"),
    Source2TableJoin = Table.Join( 
                Source2ExpandedTotalList, 
                {"Regnskab", "TotalingList"},
                Table.PrefixColumns( Table.SelectColumns(Finanskonto, {"Regnskab", "Index", "FinanskontoKey"} ), "Included"),
                {"Included.Regnskab", "Included.Index"}
            ),    
    Source2RemoveOtherColumns = Table.SelectColumns(Source2TableJoin,{"FinanskontoKey", "Included.FinanskontoKey"}),
    
    CombineSource1and2 = Table.Combine({Source1AddFinansKontoKeyIncluded, Source2RemoveOtherColumns})
in
    CombineSource1and2

/sdjensen

View solution in original post

Greg_Deckler · ‎07-01-2016

My blog article on this may help out.

http://social.technet.microsoft.com/wiki/contents/articles/32915.power-bi-merge-query-with-m.aspx

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

sdjensen · ‎07-01-2016

@Greg_Deckler - Not really. I am familiar with Table.Join and Table.NestedJoin however they need values in the two table to be equal each other I need a join where >= and <=

/sdjensen

sdjensen · ‎07-01-2016

I know that in my example all the columns I need from table 1 is equal to a value in table 2 totaling.1 or totaling.2 however there might aswell be a row in table 1 with the value 1145 which then also should be returned since it's between 1140 and 1190.

/sdjensen

Greg_Deckler · ‎07-01-2016

Also, what is the data source of the 2 tables? SQL, CSV, web?

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

sdjensen · ‎07-01-2016

I would prefer a solution in "M" for 2 reasons. 1. I like to keep all transformations in "M", so I always only have one place to search for them, 2. I need to apply the result to another table. I know I will properly be able to solve this in DAX, but I challenged myself to do it with "M", but this step gives me some problems.

EDIT: The source is SQL, but I have no control over the source and can't make or request changes to the source.

/sdjensen

sdjensen · ‎07-01-2016

So far the code to the new table I am creating is this:

let
    // 1 til 1 relation for konti af type "konto"
    Source = Table.SelectColumns( Table.SelectRows( Finanskonto, each [FinanskontoTypeSort] = 0 ), {"FinanskontoKey"} ),
    AddFinansKontoKeyIncluded = Table.AddColumn(Source, "FinanskontoKeyIncluded", each [FinanskontoKey]), 
    
    // M til M relation for konti af type "sum" og "til-sum"
    Source2 = Table.SelectColumns( 
                Table.SelectRows( Finanskonto, each List.Contains({2, 4}, [FinanskontoTypeSort]) ), 
                {"FinanskontoKey", "Totaling"}  ),
    Source2SplitByDelimiter = Table.SplitColumn(
                Source2,
                "Totaling",
                Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv),
                {"Totaling.1", "Totaling.2", "Totaling.3", "Totaling.4", "Totaling.5", "Totaling.6", "Totaling.7", "Totaling.8", "Totaling.9", "Totaling.10"}
            ),
    Source2UnpivotedColumns = Table.UnpivotOtherColumns(
                Source2SplitByDelimiter, 
                {"FinanskontoKey"}, "Attribute", "Totaling"
            ),
    Source2SplitByDelimiter2  = Table.SplitColumn(
                Source2UnpivotedColumns,
                "Totaling",
                Splitter.SplitTextByDelimiter("..", QuoteStyle.Csv),
                {"Totaling.1", "Totaling.2"}),
    Source2ReplaceNulls = Table.ReplaceValue(Source2SplitByDelimiter2, null, each _[Totaling.1], Replacer.ReplaceValue,{"Totaling.2"})

in
    Source2ReplaceNulls

/sdjensen

Greg_Deckler · ‎07-01-2016

Just to throw this out there, but this seems like it would be a relatively simple SQL query to conditionally join the tables in a SELECT statement (at least easier than in "M" but it is an interesting problem to solve so I'll take a look although someone like @ImkeF might be able to bang out the solution from memory.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

sdjensen · ‎07-01-2016

It would be very easy in SQL if the tables was like that in the source. However the Table2 I pasted is a result of the transformations I have posted earlier with splits and unpivots, so it's not a direct table from my source. The source is what I refer to in my query as "Finanskonto" and the 3 columns I pasted as table 1 is from this table.

/sdjensen

Greg_Deckler · ‎07-01-2016

Here's another thought, what if you treat this as a LOOKUP problem? I have some examples in this blog article at the bottom where no relationship is required between tables and such:

http://community.powerbi.com/t5/Community-Blog/Good-Ol-VLOOKUP-The-Ultimate-Guide-to-Lookups-in-Powe...

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

ImkeF · ‎07-01-2016

Yes, instead of joining on fields/columns which requires exact matches one can use the technique described here: http://community.powerbi.com/t5/Desktop/Complex-PowerQuery-Merge-query-using-substrings/m-p/45582#M1...

It's basically a crossjoin and then filter on what you need. So you might experience some performance-issues. ... But after having a look at your code it seems as if you're dealing with dimension tables only, so not too serious probably. And you've reduced the number of columns already - good. So just make sure to use the buffer on the table that will be passed into the crossjoin.

An alternative approach would be to create a list of numbers {Totalling.1..Totalling.2}, expand and then perform a JoinKind.Inner.

What you're doing here looks like a decomposition of the NAV account-schedules, am I right? So it would be preferrable to have a dynamic list of column names in the split with the "|". This can be accomplished by simply omitting the specific definition of the column-names like this:

Source2SplitByDelimiter = Table.SplitColumn(
                Source2,
                "Totaling",
                Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv)
            ),

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

sdjensen · ‎07-02-2016

@ImkeF: You are so right... I am creating a table to be used in a many-to-many relationship between my general ledger transaction and general ledger account dimension, so I can take advantage of NAV totaling column to create subtotals in my account dimension. This is how I normaly solve this when creating SSAS multi-dimentional cubes.

Thank you for your suggestion on omitting the specific definition of the column names in the Split by delimiter step.

My initial idea was to create a list from the 2 columns but I can't seem to find the right function to do this. Do you have any ideas?

I am not sure how I should use your crossjoin solution since I don't always know all the values to look up until I have created a list with all posible values between Totaling.1 and Totaling.2?

/sdjensen

sdjensen · ‎07-02-2016

Adding something like this will give me exactly what I need to make my final join, but how do I replace the values 1120 and 1130 with Totaling.1 and Totaling.2?

    TotalList = Table.AddColumn ( Source2ReplaceNulls, "TotalList", each {1120..1130} ),
    ExpandedTotalList = Table.ExpandListColumn(TotalList, "TotalList")
in
    ExpandedTotalList

/sdjensen

ImkeF · ‎07-02-2016

I think that this should work:

{[Totaling.1]..[Totaling.2]}

This will give you a list of numbers incrementing by 1. So probably containing numbers where no account no exists. That's when JoinKind.Inner comes into play in order to reduce the list to the existing account no only.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

sdjensen · ‎07-02-2016

@ImkeF thank you that was the solution. I had already tried that but received an error, but that was because I also needed to change type on Totaling.1 and Totaling.2 to numbers - this could be a problem since the source column in NAV is an nvarchar column that can have values that can't be converted into a number. I will need to figure out a way to handle this, either by excluding those from my code or perhaps create a surrugate key instead.

I will continue working on this and get back with my final code.

Thank you for all you help so far.

/sdjensen

ImkeF · ‎07-02-2016

Good point - as it gave me the idea for a better solution: Instead of a surrogate key, create an index on your (sorted) chart of accounts (on accounts only). This way the increment by 1 will not produce unnecessary numbers where no account matches.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

ImkeF · ‎07-02-2016

Unfortunately the "trick" with omitting the column names doesn't work - sorry. Instead we can create a dynamic one. You'll find it in steps "NoOfColumns" and "ColumnNames":

let
    ChartOfAccounts = #table({"AccountNo", "AccountType"}, {{"K100", "Account"}, {"200", "Account"}, {"K100|200", "Totalling1"}, {"300", "Account"}, {"2000", "Account"},{"3000", "Account"}, {"4000", "Account"}, {"2000..4000", "Totalling2"}, {"5000", "Account"}, {"6000", "Account"}, {"8000", "Account"}, {"9000", "Account"}, {"2000..4000|200|5000..9000", "Totalling3"}}),
    #"Filtered Rows" = Table.SelectRows(ChartOfAccounts, each ([AccountType] = "Account")),
    Indexed = Table.AddIndexColumn(#"Filtered Rows", "Index", 1, 1),
    #"Merged Queries" = Table.NestedJoin(ChartOfAccounts,{"AccountNo"},Indexed,{"AccountNo"},"NewColumn",JoinKind.LeftOuter),
    IndexedCoA = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Index"}, {"Index"}),
    #"Filtered Rows1" = Table.SelectRows(IndexedCoA, each (Text.StartsWith([AccountType], "Totalling"))),
    NoOfColumns = Table.AddColumn(#"Filtered Rows1", "NumberOfColumns", each List.Count(Text.PositionOf([AccountNo], "|", Occurrence.All))+1),
    ColumnNames = List.Transform({1..List.Max(NoOfColumns[NumberOfColumns])}, each "Column"&Text.From(_)),
    #"SplitColumn|" = Table.SplitColumn(NoOfColumns,"AccountNo",Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), ColumnNames),
    #"Removed Columns" = Table.RemoveColumns(#"SplitColumn|",{"Index", "NumberOfColumns"}),
    UpivotColumns = Table.UnpivotOtherColumns(#"Removed Columns", {"AccountType"}, "Attribute", "Value"),
    #"Removed Columns1" = Table.RemoveColumns(UpivotColumns,{"Attribute"}),
    #"SplitColumn.." = Table.SplitColumn(#"Removed Columns1","Value",Splitter.SplitTextByDelimiter("..", QuoteStyle.Csv),{"Totalling.1", "Totalling.2"}),
    #"Added Index" = Table.AddIndexColumn(#"SplitColumn..", "Index", 0, 1),
    IndexedAccountType = Table.AddColumn(#"Added Index", "IndexedAccountType", each Text.Combine({[AccountType], Text.From([Index], "de-DE")}, "--"), type text),
    #"Removed Columns3" = Table.RemoveColumns(IndexedAccountType,{"AccountType", "Index"}),
    AllAccountsInOneColumn = Table.UnpivotOtherColumns(#"Removed Columns3", {"IndexedAccountType"}, "Attribute", "Value"),
    LookupAccountIndex = Table.NestedJoin(AllAccountsInOneColumn,{"Value"},IndexedCoA,{"AccountNo"},"NewColumn",JoinKind.LeftOuter),
    ShowAccountIndex = Table.ExpandTableColumn(LookupAccountIndex, "NewColumn", {"Index"}, {"Index"}),
    #"Removed Columns2" = Table.RemoveColumns(ShowAccountIndex,{"Value"}),
    PivotBackTo2Columns = Table.Pivot(#"Removed Columns2", List.Distinct(#"Removed Columns2"[Attribute]), "Attribute", "Index"),
    ListOfIndices = Table.AddColumn(PivotBackTo2Columns, "AccountIndices", each if [Totalling.2]=null then {[Totalling.1]} else {[Totalling.1]..[Totalling.2]}),
    #"Expanded AccountIndices" = Table.ExpandListColumn(ListOfIndices, "AccountIndices"),
    LookupAccountNo = Table.NestedJoin(#"Expanded AccountIndices",{"AccountIndices"},IndexedCoA,{"Index"},"NewColumn",JoinKind.LeftOuter),
    ShowAccountNo = Table.ExpandTableColumn(LookupAccountNo, "NewColumn", {"AccountNo"}, {"AccountNo"}),
    CleanUpTotalling = Table.SplitColumn(ShowAccountNo,"IndexedAccountType",Splitter.SplitTextByEachDelimiter({"--"}, QuoteStyle.Csv, true),{"IndexedAccountType.1", "IndexedAccountType.2"}),
    #"Removed Other Columns" = Table.SelectColumns(CleanUpTotalling,{"IndexedAccountType.1", "AccountNo"}),
    #"Sorted Rows" = Table.Sort(#"Removed Other Columns",{{"IndexedAccountType.1", Order.Ascending}})
in
    #"Sorted Rows"

This is a full code incl. sample data which you can copy into your advanced editor and follow the steps along. But don't use Internet Explorer, because that's buggy and will insert non-printable characters or like which will cause strange error-messages!

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

sdjensen · ‎07-02-2016

Okay I got a working solution now... thanks to @ImkeF and @Greg_Deckler for all their time and help.

@ImkeF - it's strange that your trick doesn't work for you it worked for me? I would really like to accept your last code as a solution, but I haven't tested it, but I think our solutions are very close to each other.

this is my final code:

let
    // 1 til 1 relation for konti af type "konto"
    Source1 = Table.SelectColumns( Table.SelectRows( Finanskonto, each [FinanskontoTypeSort] = 0 ), {"FinanskontoKey"} ),
    Source1AddFinansKontoKeyIncluded = Table.AddColumn(Source1, "Included.FinanskontoKey", each [FinanskontoKey]), 
    
    // M til M relation for konti af type "sum" og "til-sum"
    Source2 = Table.SelectColumns( 
                Table.SelectRows( Finanskonto, each List.Contains({2, 4}, [FinanskontoTypeSort]) ), 
                {"FinanskontoKey", "Regnskab", "Finanskonto Nummer", "Totaling"}  
            ),
    Source2SplitByDelimiter = Table.SplitColumn(
                Source2,
                "Totaling",
                Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv)
            ),
    Source2UnpivotedColumns = Table.UnpivotOtherColumns(
                Source2SplitByDelimiter, 
                {"FinanskontoKey", "Regnskab", "Finanskonto Nummer"}, "Attribute", "Totaling"
            ),
    Source2SplitByDelimiter2  = Table.SplitColumn(
                Source2UnpivotedColumns,
                "Totaling",
                Splitter.SplitTextByDelimiter("..", QuoteStyle.Csv),
                {"Totaling.1", "Totaling.2"}
            ),
    Source2ReplaceNulls = Table.ReplaceValue(Source2SplitByDelimiter2, null, each _[Totaling.1], Replacer.ReplaceValue,{"Totaling.2"}
            ),
    Source2AddedTotaling1Idx = Table.Join(
                Source2ReplaceNulls,
                {"Regnskab", "Totaling.1"},
                Table.PrefixColumns(Table.SelectColumns( Finanskonto, {"Regnskab", "Finanskonto Nummer", "Index"} ), "Idx1"),
                {"Idx1.Regnskab", "Idx1.Finanskonto Nummer"}     
            ),
    Source2AddedTotaling2Idx = Table.Join(
                Source2AddedTotaling1Idx,
                {"Regnskab", "Totaling.2"},
                Table.PrefixColumns(Table.SelectColumns( Finanskonto, {"Regnskab", "Finanskonto Nummer", "Index"} ), "Idx2"),
                {"Idx2.Regnskab", "Idx2.Finanskonto Nummer"}     
            ),
    Source2TotalingList = Table.AddColumn ( Source2AddedTotaling2Idx, "TotalingList", each {[Idx1.Index]..[Idx2.Index]} ),
    Source2ExpandedTotalList = Table.ExpandListColumn(Source2TotalingList, "TotalingList"),
    Source2TableJoin = Table.Join( 
                Source2ExpandedTotalList, 
                {"Regnskab", "TotalingList"},
                Table.PrefixColumns( Table.SelectColumns(Finanskonto, {"Regnskab", "Index", "FinanskontoKey"} ), "Included"),
                {"Included.Regnskab", "Included.Index"}
            ),    
    Source2RemoveOtherColumns = Table.SelectColumns(Source2TableJoin,{"FinanskontoKey", "Included.FinanskontoKey"}),
    
    CombineSource1and2 = Table.Combine({Source1AddFinansKontoKeyIncluded, Source2RemoveOtherColumns})
in
    CombineSource1and2

/sdjensen

ImkeF · ‎07-02-2016

Great - that looks neat and clean 🙂

Yes, our solutions are quite similar - you managed with one pivoting-less.

But watch out: Omitting the column names will only work if the Totalling with the most |-signs sits in the first row of your table!

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

sdjensen · ‎07-02-2016

Thank you for that last tip, I will have to handle that.

/sdjensen

Greg_Deckler · ‎07-01-2016

Does it have to be in "M"? Because CALCULATETABLE in DAX may get you there.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Power Query Join tables

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly