cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Greg_Deckler
Super User
Super User

joinAlgorithm and keyEqualityComparers

In looking at the documentation for Table.Join and Table.NestedJoin, there is a brief mention of optional parameters of joinAlgorithm and keyEqualityComparers. However, I cannot find any information on how to use these. Does anyone have any examples of these? 

 

I particularly want to know if these can be used to support unequal matching conditions, for example, only join if > or < versus =.

 

@ImkeF?


@ me in replies or I'll lose your thread!!!
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Learn Power BI 2nd Edition
1 ACCEPTED SOLUTION

Now I feel really honoured - thank you guys 🙂

 

Honestly: I haven't recognized these 2 parameters before. And while the MS-documentation holds some things about the joinAlgorithms in here: https://msdn.microsoft.com/en-us/library/mt296614.aspx, there's zero about keyEqualityComparers.

 

But with regards to the desired relative comparison here (< or >), I dare to say that they probably wouldn't help, as M distinguishes between equality and relational comparisons.

 

So in order to not let you down here, please have a look at the following query, which performs a relative lookup (and will hopefully appeal to you DAX-masters 🙂 😞

 

let
    Table1 = #table({"Key1"},{{10},{20},{30},{40}}),
    Table2 = #table({"Key2"},{{9},{19},{29},{39}}),
    RelativeMerge = Table.AddColumn(Table1, "RelativeJoin", 
            (Earlier) => Table.SelectRows(Table2, 
                         each [Key2]<Earlier[Key1])),
    #"Expanded RelativeJoin" = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {"Key2"}, {"Key2"})
in
    #"Expanded RelativeJoin"

 

 

(This is nested row-context: Yes, we have evaluation context in M as well & fortunately it doesn't behave like a moving target 🙂 )

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

24 REPLIES 24
freelensia
Helper III
Helper III

Hi @ImkeF 

I converted your code into the function below.

 

(LookInTbl as table, KeywordTbl as table, LookInCol as text, KeywordCol as text) =>
// https://community.powerbi.com/t5/Desktop/joinAlgorithm-and-keyEqualityComparers/m-p/182148#M79842

let
    RelativeMerge = Table.AddColumn(LookInTbl, "RelativeJoin", 
        (Earlier) => Table.SelectRows(KeywordTbl, 
                each Text.Contains(Table.Column(Earlier,LookInCol),Table.Column(KeywordTbl,KeywordCol), Comparer.OrdinalIgnoreCase))),
    ExpandRelativeJoin = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {KeywordCol}, {KeywordCol})
in
    ExpandRelativeJoin

Then called it like this:

RelativeMerge = SearchColForKeywords(GetWordsTbl, KeywordTbl, "English", "Keywords"),  

But it generated this error.

Expression.Error: We cannot convert a value of type List to type Text.
Details:
    Value=[List]
    Type=[Type]

 Could you check to see if I wrote something incorrectly?
Thanks!

Hi @freelensia ,

a table reference followed by a column name will return a list and not a text (for the first argument of the Text.Contains-function). Hence the error message. To transform my query into a function you have to operate on a record-level like so instead:

 

let
    LookInTbl = Table.Buffer(#table({"Key1"},{{"Auto"},{"Bus"},{"Autobus"}, {"Car"}})),
    KeywordTbl = Table.Buffer(#table({"Key2"},{{"Auto"},{"Bus"}})),
    LookInCol = "Key1",
    KeywordCol = "Key2",
    fnRelMerge = (LookInTbl as table, KeywordTbl as table, LookInCol as text, KeywordCol as text) =>
    let
            RelativeMerge = Table.AddColumn(LookInTbl, "RelativeJoin", 
            (Earlier) => Table.SelectRows(KeywordTbl, 
                         each Text.Contains(Record.Field(Earlier, LookInCol), Record.Field(_, KeywordCol), Comparer.OrdinalIgnoreCase))),
            #"Expanded RelativeJoin" = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {"Key2"}, {"Key2"})
    in
        #"Expanded RelativeJoin",
    CallFunction = fnRelMerge(LookInTbl, KeywordTbl, LookInCol, KeywordCol )
in
    CallFunction

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

This works great! I just wanna share my final function. In this function I added 2 steps: filtering out non-matching rows, and removing the additional column. Hope this helps other people.

 

{SearchColForKeywords}
(LookInTbl as table, KeywordTbl as table, LookInCol as text, KeywordCol as text) =>

let
    RelativeMerge = Table.AddColumn(LookInTbl, "RelativeJoin", 
        (Earlier) => Table.SelectRows(KeywordTbl, 
            each Text.Contains(Record.Field(Earlier, LookInCol), Record.Field(_, KeywordCol), Comparer.OrdinalIgnoreCase))),
    ExpandRelativeJoin = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {KeywordCol}, {"Keywords found"}),
    FilterRows = Table.SelectRows(ExpandRelativeJoin, each [Keywords found] <> null and [Keywords found] <> ""),
    RemoveColumn = Table.RemoveColumns(FilterRows,{"Keywords found"})
in
    RemoveColumn
otravers
Continued Contributor
Continued Contributor

@freelensia can you please share a sample file with your function invoked to add a custom column to a table? It looks exactly like what I need but I'm struggling to pick the right table and column references (I'm easily confused!).

 

Edit: for reference, I ended up doing a partial match by using a cartesian product as described in this post:

https://www.myspreadsheetlab.com/power-query-solution-partial-match-count-with-a-condition/

 

Note that this will create a bunch of duplicate rows, so make sure to have an index column for your initial rows (before the cartesian product), then sort (decreasing) by the match column (the one with Text.PositionOf) then use Table.Buffer before you can finally remove duplicates:

https://www.youtube.com/watch?v=rqDdnNxSgHQ

@otraverspl see my improved code below.

Hi guys, just sharing my improved function one last time.

 

There was a small problem in my previous solution. If multiple keywords matched, the function will duplicate the row. For example if we have keywords: "cats", "dogs", "birds" and the cell has this text "The dogs chased the cats while the birds watched", you will have 3 identical rows each with one keyword that matched.

So we needed to group and concatenate matched keywords.

 

//Keep rows of table LookInTbl if its value in LookInCol contains any keyword in KeywordCol of KeywordTbl
//Used like this: 
//    NextStep = KeepRowsIfFindAnyKW(LastStep, "Word ID", KeywordLang, KeywordsTbl, "Keywords"),

(LookInTbl as table, IDCol as text, LookInCol as text, KeywordTbl as table, KeywordCol as text) =>

let
    //Search a column for any text from another table https://community.powerbi.com/t5/Desktop/joinAlgorithm-and-keyEqualityComparers/m-p/1013429#M479534
    RelativeMerge = Table.AddColumn(LookInTbl, "RelativeJoin", 
        (Earlier) => Table.SelectRows(KeywordTbl, 
            each Text.Contains(Record.Field(Earlier, LookInCol), Record.Field(_, KeywordCol), Comparer.OrdinalIgnoreCase))), //case insensitive
    ExpandRelativeJoin = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {KeywordCol}, {"Keywords Found"}),
    FilterRows = Table.SelectRows(ExpandRelativeJoin, each [Keywords Found] <> null and [Keywords Found] <> ""),

    // Group by first keyword and keep all columns https://community.powerbi.com/t5/Power-Query/Group-by-date-and-keep-all-columns/td-p/794601
    GroupRows = Table.Group(FilterRows, {IDCol}, {{"First Keyword Found", each List.Max([Keywords Found]), type text}, {"All Data", each _, type table}}),
    ExpandAll = Table.ExpandTableColumn(GroupRows, "All Data", List.RemoveItems(Table.ColumnNames(GroupRows[All Data]{0}), {IDCol})), //dynamically expanding all columns except Word ID and Keywords found https://eriksvensen.wordpress.com/2019/10/05/powerquery-control-the-expand-columns-so-it-includes-new-columns/
    SelectRows = Table.SelectRows(ExpandAll,each [First Keyword Found]=[Keywords Found]),
    RemoveCols = Table.RemoveColumns(SelectRows,{"First Keyword Found", "Keywords Found"})
in
    RemoveCols

 

Anonymous
Not applicable

It looks like the documentation for keyEqualityComparers is the same as the joinOptions for Table.FuzzyJoin

 

https://docs.microsoft.com/en-us/powerquery-m/table-fuzzyjoin

 

I've played with them a little and gotten them to work, i.e.:

 

  #"Merged Queries" = Table.NestedJoin(Source, {"Text1", "Text2"}, #"VCP-Microsoft-Unifications", {"Name", "Text2"}, "VCP Microsoft Unifications", JoinKind.LeftOuter, [IgnoreCase = true, IgnoreSpace = false]),

 

Hope that helps.

 

Zig.

 

tringuyenminh92
Memorable Member
Memorable Member

haha, i think you are asking correct guy @Greg_Deckler, @ImkeF will like this Heart

@tringuyenminh92a slight correction - the correct gal Smiley Happy

 

oh, After knowing your correction @Sean, I really want to show my admirable to her.

Now I feel really honoured - thank you guys 🙂

 

Honestly: I haven't recognized these 2 parameters before. And while the MS-documentation holds some things about the joinAlgorithms in here: https://msdn.microsoft.com/en-us/library/mt296614.aspx, there's zero about keyEqualityComparers.

 

But with regards to the desired relative comparison here (< or >), I dare to say that they probably wouldn't help, as M distinguishes between equality and relational comparisons.

 

So in order to not let you down here, please have a look at the following query, which performs a relative lookup (and will hopefully appeal to you DAX-masters 🙂 😞

 

let
    Table1 = #table({"Key1"},{{10},{20},{30},{40}}),
    Table2 = #table({"Key2"},{{9},{19},{29},{39}}),
    RelativeMerge = Table.AddColumn(Table1, "RelativeJoin", 
            (Earlier) => Table.SelectRows(Table2, 
                         each [Key2]<Earlier[Key1])),
    #"Expanded RelativeJoin" = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {"Key2"}, {"Key2"})
in
    #"Expanded RelativeJoin"

 

 

(This is nested row-context: Yes, we have evaluation context in M as well & fortunately it doesn't behave like a moving target 🙂 )

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

That was a super helpful example code @ImkeF, thanks so much! I was able to apply it to my own situation super easily!

Fantastic example!  Thank you for providing it.  Do you think the relative merge function you provided below can be modified with a Text.Contains function, so you could essentially do a partial text match merge on two tables?

Sure!

This query will match all words from Table2 who are somehow included in the strings of Table1 in a case insensitive mode:

 

let
    Table1 = Table.Buffer(#table({"Key1"},{{"Auto"},{"Bus"},{"Autobus"}, {"Car"}})),
    Table2 = Table.Buffer(#table({"Key2"},{{"Auto"},{"Bus"}})),
    RelativeMerge = Table.AddColumn(Table1, "RelativeJoin", 
            (Earlier) => Table.SelectRows(Table2, 
                         each Text.Contains(Earlier[Key1],[Key2], Comparer.OrdinalIgnoreCase))),
    #"Expanded RelativeJoin" = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {"Key2"}, {"Key2"})
in
    #"Expanded RelativeJoin"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

This SAVED ME Major!

 

It was small, but gave me much to read and apply!

 

I even added in the Table.FirstN to limit my matches to 1. 

 

let Table1 = Table.Buffer(TransactionsOnly), /* Bank Transactions already cleaed in another query*/ Table2 = Table.Buffer(MatchMatrix), /*Excel File With FROM and TO Columns to be used in my compare, from another query*/ FakeMerge = Table.AddColumn( Table1,"RelMerge", (OrigTable) => Table.FirstN( Table.SelectRows( Table2, each Text.Contains(OrigTable[TheDESCR],[From] ,Comparer.OrdinalIgnoreCase) ),1) /*Limits to first row match*/ ), #"Expanded RelMerge" = Table.ExpandTableColumn(FakeMerge, "RelMerge", {"To"}, {"To"}), #"Replaced Value" = Table.ReplaceValue(#"Expanded RelMerge",null,"Other",Replacer.ReplaceValue,{"To"}), #"Changed Type" = Table.TransformColumnTypes(#"Replaced Value",{{"Amount", type number}}), #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Source] = " Capital 1 - 1324")) in #"Filtered Rows"

 

ignore the stuff below I am still troubleshooting somethings, but wanted to share this! very powerful stuff! 

 

With my approach as new transactions appear, I can easily update the excel file and it will now have a new value to test against and convert 

Hi @ImkeF ,

 

Sorry for replying to an old post, but needed your valuable help to my query as this problem looked similar to mine:

 

I have a single column table named WordList containing a unique list of values, whose each word value i want to do a partial match with phrases in multiple Attribute columns in another table named Data Extract and get a count of the item codes that show these word values. I want to count only the First occurence of a word value for each item

 

e.g. A word value may occur in a single phrase multiple times OR across multiple Attributes for that Item, but it should be counted as 1.

 

WordList Table:

 

Words

auto

bus

ingenius

natural

color

 

 

DataExtract Table:

 

ItemID    Attr1                       Attr2                        Attr3                      Attr4

10001      natural and soft     put in auto mode      so natural!               get auto; get auto mode

10002      bus at stop             natural icecream       boy is ingenius        more power

10003      let me out!             promote goodwill     simple simon           auto model

10004      matching color      natural color              natural essence       figure it!

10005      god will                  meet at bus stop       similar cases            rosetta stone

10006      mud ringworm       helper columns         rainbow color          geneva convention

10007      masked man          tenor voice                silica gel                   natural smell

10008      microcosm             guns of navarone      bloated cell              enigmatic man

10009      cellular sky             i am natural               color coated            ingenius of him!

10010      ingenius body        maker chambers       old story                  minor changes

 

 

Output Table:

Words         ItemCount          %ItemCount

auto                   2                              =2/10

bus                     2                              =2/10

ingenius             3                              =3/10

natural               5                              =5/10

color                  3                              =3/10

 


How do i find the Item Counts?

 

not sure what you mean with "partial match":

1) Would "Autobus" return a match (or even 2)?

 

and if so: only for "bus" or also for "auto" - leading to the second question:

 

2) Shall this be case sensitive or not?

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi @ImkeF ,

 

Thanks for your reply. 🙂

 

Sorry, i added a new post later with the complete issue on the Desktop board. Got a solution using DAX in PowerBI, but needed a Power Query solution in Excel as i am on Excel 2016 and as the data is huge.

 

To answer your question - sorry again for the wrong word used ie. partial. What i simply meant is:

 

Only search for each word given in the Wordlist Table in the phrases found in the Attribute columns in DataExtract table. and get a count of the items that have atleast one occurence of each word. Also, it will be case-insensitive.

 

You may reply to that new post. Again, apologies for the replying to an old post, instead of starting a new post.

Thank you!

If you run into performance issues with these techniques, it's worth checking out if you can partition your tables: http://www.thebiccountant.com/2017/05/29/performance-tip-partition-tables-crossjoins-possible-powerq...

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Helpful resources

Announcements
Microsoft Build 768x460.png

Microsoft Build is May 24-26. Have you registered yet?

Come together to explore latest innovations in code and application development—and gain insights from experts from around the world.

charticulator_carousel_with_text (1).png

Charticulator Design Challenge

Put your data visualization and design skills to the test! This exciting challenge is happening now through May 31st!

May UG Leader Call Carousel 768x460.png

What difference can a User Group make for you?

At the monthly call, connect with other leaders and find out how community makes your experience even better.

Power BI Dev Camp Session 22 768x460.jpg

Check it out!

Mark your calendars and join us on Thursday, May 26 at 11a PDT for a great session with Ted Pattison!