Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I have a large list of product data from Amazon (5000 rows) and need to remove irrelevant products based on text that shows in the product title column.
So I have used powerquery and the Advanced Filter option found in the header filter tool.
Yet it is not clearer down the data. I need it to match all of the different words I add to the list of excluded word.
I've tried using the "OR" and "AND" options when adding new clauses, but it is still not working.
I can imagine there are workarounds by creating new custom columns, but I would prefer to use built in features like this if indeed they do work and I'm just using it wrong.
So only solutions that use the tool please or can someone from Microsoft confirm if this a known bug?
The code for my query shows as follows:
(NOTE I have replaced the text strings from my actual data with placeholders)
= Table.SelectRows(#"Trimmed Text", each not Text.Contains([Title], "WORD 1") and not Text.Contains([Title], "WORD 2") and not Text.Contains([Title], "WORD 3") and not Text.Contains([Title], "WORD 4") and not Text.Contains([Title], "WORD 5") and not Text.Contains([Title], "WORD 6") and not Text.Contains([Title], "WORD 7") and not Text.Contains([Title], "WORD 8") and not Text.Contains([Title], "WORD 9") and not Text.Contains([Title], "WORD 10") and not Text.Contains([Title], "WORD 11") and not Text.Contains([Title], "WORD 12") and not Text.Contains([Title], "WORD 13") and not Text.Contains([Title], "WORD 14") and not Text.Contains([Title], "WORD 15") and not Text.Contains([Title], "WORD 16") and not Text.Contains([Title], "WORD 17"))
Solved! Go to Solution.
=Table.SelectRows(#"Trimmed Text", each not List.Contains({"Word 1","Word 2","Word 3"},[Title],(x,y)=>Text.Contains(y,x)))
=Table.SelectRows(#"Trimmed Text", each not List.Contains({"Word 1","Word 2","Word 3"},[Title],(x,y)=>Text.Contains(y,x)))
Thanks Vijay
Appreciate you going to the effort of building some custom code to fix this issue. I'm still pretty much a novice at using PowerQuery, so have no idea how to take you code and make the necessary amends needed to get it too work with my data. i.e. the example I gave in my original post used placeholder text of Word 1, Word 2, etc, which you have referenced in your code I see. But the actual words in my data are all unique words. So not sure how that would impact on your code?
Also I have no idea how to take your code and apply it to my original powerquery query which also has other steps to clean up the raw data.
I see you are a poweruser, I don't suppose you know why Microsoft's built in Advanced Filters is not working?
1. It will work. If your all words are unique, most likely reason is that it is not working because of case sensitivity. PQ is super case-sensitive. Whereever you have used Text.Contains, you need to use Comparer.OrdinalIgnoreCase parameter to switch off case-sensitivity.
Look below
Text.Contains([Title], "WORD 1",Comparer.OrdinalIgnoreCase)
2. There are other techniques also available like having a list of Words on which you want to performing filtering. But you already have a workable query which will give the result after using Comparer.OrdinalIgnoreCase.
-------------------------------------------------------------------------------------------------------
👍 It's been a pleasure to help you | Help Hours: 11 AM to 9 PM (UTC+05:30)
How to get your questions answered quickly -- How to provide sample data
-------------------------------------------------------------------------------------------------------
See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test. For you relevant step is Custom1
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSkxKUQjPL0pRMFSK1YlWAjFNzUzA7JTUNIiUhUJGVgpYyFEhKT8/G6LSP8hFwdAMKJWTqhQbCwA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Title = _t]),
#"Trimmed Text" = Table.TransformColumns(Source,{{"Title", Text.Trim, type text}}),
Custom1 = Table.SelectRows(#"Trimmed Text", (x) => not List.AnyTrue(List.Transform({1..17}, each Text.Contains(x[Title], "Word " & Text.From(_), Comparer.OrdinalIgnoreCase))))
in
Custom1
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.