Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
C4YNelis
Advocate II
Advocate II

Matching column end against a list of values

Dear all,

 

not being a true M Query expert, I have tried to match a list of values against the ending of text values in a column.

 

The list of values to match can be thought of like a list of categories: {BOOKS, TOYS, CARS, FRUIT} (this list hardly makes sense, I know, it's just for illustration purposes) 🙂

 

The column of text values would be like: [PROD_NO] = {SomeBook-BOOKS, differentBook-BOOKS, notABookAtAll, car-TOYS, carCARS, ***randomFruit-FRUIT} - the idea is that there is no real structured link between the items, nor is there a clear separator or even clearity whether there is a matching ending to begin with (no pun intended).

 

So, after some trying and with the help I've found from searching around I have managed to do what I wanted, which is this:

I've generated a new column which contains the matching value from the list of categories when the column PROD_NO ends with that specific ending. So, in this case, the table would look like:

PROD_NOCategory
SomeBook-BOOKSBOOKS
differentBook-BOOKSBOOKS
notABookAtAllnull
car-TOYSTOYS
carCARSCARS
***randomFruit-FRUITFRUIT

 

Now, this works with the formula's below:

 

#"Added Custom" = Table.AddColumn(#"Source_Table", "Category", each let Prodno = [PROD_NO] in List.Select(#"list of categories", each Text.EndsWith(Prodno, _)){0}),
#"Replaced Errors" = Table.ReplaceErrorValues(#"Added Custom", {{"Category", null}})

 

However, I have three concerns with this approach:

- I personally believe (read: I've always been tought) it is bad practise to rely on errorhandling as part of the main routine and thus "abuse" the error thrown in this way, rather than utilising a proper formula.

- The other thing is that it is quite slow for a medium sized source (say: table with appr. 50k rows and only four values in the list - it takes about a minute to load).

- And finally, when I'm trying to see if it does find any values, by clicking the column drop down check, it doesn't show me anything (there aren't that many "hits" in my original source list, probably none in the first 200 records loaded by default and it's too slow to get the remainder of the rows in proper time).

 

So my question is, how do I solve this in a proper way (and preferably in a more efficient way)?

 

Thank you for your time and help,

 

cheers,

Niels

1 ACCEPTED SOLUTION
Mariusz
Community Champion
Community Champion

Hi @C4YNelis 

 

Try this.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCs7PTXXKz8/WdfL39w5WitWJVkrJTEtLLUrNK0ETz8svcQQJOZY45uSARZITi3RD/CODYRxnxyAIW0tLqygxLyU/162oNLNE1y0o1DNEKTYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [PROD_NO = _t]),
    category = List.Buffer( {"BOOKS", "TOYS", "CARS", "FRUIT" } ),
    #"Added Custom" = Table.AddColumn( Source, "Custom", each 
        List.Select( 
            category, 
            (cat) => Text.EndsWith( [PROD_NO], cat ) 
        ) 
    ),
    #"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Custom",{{"Custom", type text}})
in
    #"Changed Type"
 
To avoid errors you can just expand the added column If the list is empty it will read it as null.
Try using List.Buffer() on your list, it should improve the performance.
 
Best Regards,
Mariusz

If this post helps, then please consider Accepting it as the solution.

Please feel free to connect with me.
LinkedIn

 

View solution in original post

4 REPLIES 4
Mariusz
Community Champion
Community Champion

Hi @C4YNelis 

 

Try this.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCs7PTXXKz8/WdfL39w5WitWJVkrJTEtLLUrNK0ETz8svcQQJOZY45uSARZITi3RD/CODYRxnxyAIW0tLqygxLyU/162oNLNE1y0o1DNEKTYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [PROD_NO = _t]),
    category = List.Buffer( {"BOOKS", "TOYS", "CARS", "FRUIT" } ),
    #"Added Custom" = Table.AddColumn( Source, "Custom", each 
        List.Select( 
            category, 
            (cat) => Text.EndsWith( [PROD_NO], cat ) 
        ) 
    ),
    #"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Custom",{{"Custom", type text}})
in
    #"Changed Type"
 
To avoid errors you can just expand the added column If the list is empty it will read it as null.
Try using List.Buffer() on your list, it should improve the performance.
 
Best Regards,
Mariusz

If this post helps, then please consider Accepting it as the solution.

Please feel free to connect with me.
LinkedIn

 

Dear @Mariusz ,

 

Thank you for your quick reply and solution.

 

This is all that I was looking for and more! I never new about List.Buffer, but that seems worth remembering.

 

Just out of curiousity, if you don't mind, can you tell me what the difference is between calling:

each VAR = expression in ...

or

(VAR) => expression?

 

As far as I understand they do somewhat of the same? And does this way of calling the function also impact performance, or is that all List.Buffer?

 

Again, thank you for your help,

best regards,

Niels

Mariusz
Community Champion
Community Champion

Hi @C4YNelis 

 

()=> is just esthetics, and a preference I guess.

If you did get any performance gains, it would be from List.Buffer(), this similarly as Table.Buffer() caches the list/table in memory (RAM) for use during the code execution.

 

Best Regards,
Mariusz

Please feel free to connect with me.
LinkedIn

 

Thank you very much for your help @Mariusz 

 

It provided me with a huge performance gain and answered another question in the process. Loading the report went from well over a minute to a matter of seconds again. I'll definitely be utilizing this option much more in the future!

 

For now, it's time to revise some older queries currently running in production... 😉

 

Cheers,

Niels

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors