cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
MichaelStem
Frequent Visitor

Filtering a text column for a specific pattern

Hi all,

 

I have a text column containing part numbers. There are many different patterns for the part number, some containing just numbers, some having numbers separated by a hyphen, different lengths etc. 12345-678, ABCDEF, ABC-1234, 9876543, ZYXWVUT ...

 

I am trying to filter just those that have the specific format : first 3 characters are always letters followed by a hyphen followed by 4 numbers and nothing else after that eg ABC-1234. 

 

I have explored SEARCH, FIND, LEFT, RIGHT, ... but none are able to filter just the part numbers I am lookign for. Is there a way to filter based on a text template, or some other way to achieve what I am lookign for?

 

All help and suggestions greatefully received. 

1 ACCEPTED SOLUTION
Chihiro
Solution Sage
Solution Sage

Best bet is to create custom function. But it requires iterating through string, not particulary smart or easy to do.

 

I'd suggest using R Script in the query editor.

 

pattern <- "^[[:alpha:]]{3}\\-\\d{4}"
isMatch <- function(x) {grepl(pattern, as.character(x), ignore.case=TRUE)}
return <- within(dataset,{Flag=isMatch(dataset$PartNumber)})

Replace "PartNumber" with your actual column name.

 

return.

0.JPG

 

Then it's simply filtering based on "Flag" column.

 

Edit: You can find RegEx pattern cheat sheet for R in link below.

https://www.rstudio.com/wp-content/uploads/2016/09/RegExCheatsheet.pdf

View solution in original post

4 REPLIES 4
Chihiro
Solution Sage
Solution Sage

Best bet is to create custom function. But it requires iterating through string, not particulary smart or easy to do.

 

I'd suggest using R Script in the query editor.

 

pattern <- "^[[:alpha:]]{3}\\-\\d{4}"
isMatch <- function(x) {grepl(pattern, as.character(x), ignore.case=TRUE)}
return <- within(dataset,{Flag=isMatch(dataset$PartNumber)})

Replace "PartNumber" with your actual column name.

 

return.

0.JPG

 

Then it's simply filtering based on "Flag" column.

 

Edit: You can find RegEx pattern cheat sheet for R in link below.

https://www.rstudio.com/wp-content/uploads/2016/09/RegExCheatsheet.pdf

View solution in original post

Hi, thanks for you solution. what if I'd like to apply it to multiple columns? thanks in advance

Genius, many thanks Chihiro.

 

I had not played with R before, but installed, ran your script and it worked first time.

 

I did notice that after running the script it seemed to have created 2 rows with ")" in the PartNumber field. I have deleted these rows but not sure how they get there. They are definitely created by running the R script as they are not there before.

You are welcom. I can't think of any reason why it would create extra rows...

 

R is very powerful tool and compliments PowerBI very well.

Helpful resources

Announcements
Power BI October Update 2021.jpg

Power BI Release

Click here to read more about the October 2021 Release!

Microsoft Ignite 768x460.png

Find your focus

Explore the latest tools,training sessions,technical expertise, networking and more.

Power BI Womens Summit 2021 768 x460.jpg

Interviews, learning sessions, allies, and more!

#PowerBIWomenSummit

Teds Dev Camp Oct. 2021 768x460.jpg

Power BI Dev Camp - October 28th, 2021

Mark your calendars and join us for our next Power BI Dev Camp!