Ezay
New Member

The same REGEX query in an R script gives 3 different results

Hello, this is my first topic here; I'm glad to share my "REGEX with R" issue with you.
This is also my first calculated column with an R script, but I really needed this language to test data against a regex.

This is my script:

# 'dataset' holds the input data for this script

Regex_apply <- function(x, y) { grepl(x, y) }
output <- within(dataset, {ValidRegex = Regex_apply(dataset$REGEX_simple, dataset$VAT)})

It is built to apply the regex stored in the REGEX_simple column to the VAT code in the VAT column and to return a Boolean TRUE/FALSE.
I'm working on a table of companies in around 30 different countries, and I want to check whether each VAT code is valid.
My script works well when I reduce the scope to a single country (for example, Germany): it separates the valid codes from the invalid ones.
When I add a second country to the scope, the results are fine for one country and completely wrong for the other.
When I use the full scope, the results for a given country are either all correct or all wrong, and I can't find a pattern that explains why one country is OK and another is not.

Ezay_1-1641920420342.png

 

Ezay_0-1641920374387.png

 

 



(I had to hide some data, but there were 9 digits after DE and 12 after SE, so the result should be TRUE for every row in each table.)
Instead, we get:

Number of countries   DE   SE
1                     OK   OK
2                     OK   KO
full                  KO   OK


I think I missed something, and I would be really grateful if someone could help me find what it is.

Icey
Community Support

Hi @Ezay ,

 

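This happens because grepl() accepts only a single pattern: when the 'pattern' argument is a vector with more than one element, only the first element is used (R emits a warning). Every VAT code is therefore tested against the regex from the first row of the table, which is why the result changes with the scope.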

Icey_2-1642148398506.png

Reference: grep: Pattern Matching and Replacement (rdrr.io)
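
The behaviour can be reproduced outside Power BI with a few rows. This is only a minimal sketch: the column names follow the question, but the VAT values and regexes are invented for illustration.

```r
# Hypothetical sample data (values are made up, not taken from the thread).
dataset <- data.frame(
  VAT          = c("DE123456789", "SE123456789012", "DE12345"),
  REGEX_simple = c("^DE[0-9]{9}$", "^SE[0-9]{12}$", "^DE[0-9]{9}$"),
  stringsAsFactors = FALSE
)

# grepl() keeps only the first pattern, "^DE[0-9]{9}$", and warns about it.
# The warning is suppressed here only to keep the output short.
first_only <- suppressWarnings(grepl(dataset$REGEX_simple, dataset$VAT))

# Row 2 is a well-formed SE code, yet it is tested against the DE regex.
print(data.frame(VAT = dataset$VAT, ValidRegex = first_only))
```

With the DE regex in the first row, the DE codes are judged correctly and the valid SE code comes back FALSE. If a different country's regex happened to be first, the roles would swap, which would be consistent with the results changing as the scope grows.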

 

Icey_1-1642148255744.png

 

 

I then followed this post to build a workaround:

 

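# 'dataset' holds the input data for this script
library(purrr)

# Test a single VAT value (y) against a single regex (x).
Regex_apply <- function(x, y) {
  if (grepl(x, y)) {
    check <- TRUE
  } else {
    check <- FALSE
  }
  check
}

# map2() calls Regex_apply once per row, pairing each regex with the VAT code on the same row.
# Note that map2() returns a list, so ValidRegex is a list column.
output <- within(dataset, {ValidRegex = map2(dataset$REGEX_simple, dataset$VAT, Regex_apply)})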

 

Icey_0-1642148206509.png

 

However, this won't work in Power BI. 

Icey_0-1642148950880.png
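
A base-R alternative, offered here only as a sketch and not verified inside Power BI: mapply() pairs each regex with the VAT code on the same row and returns a plain logical vector, so it needs no extra package and produces no list column. Whether either of those is what makes the purrr version fail is an assumption, since the error message is only visible in the screenshot. check_vat is a hypothetical helper name, and the sample values are invented.

```r
# Hypothetical helper: one grepl() call per (pattern, value) pair.
check_vat <- function(patterns, values) {
  # USE.NAMES = FALSE keeps the result an unnamed logical vector.
  mapply(function(p, v) grepl(p, v), patterns, values, USE.NAMES = FALSE)
}

# Stand-in for the 'dataset' that Power BI passes to the script (made-up values).
dataset <- data.frame(
  VAT          = c("DE123456789", "SE123456789012", "DE12345"),
  REGEX_simple = c("^DE[0-9]{9}$", "^SE[0-9]{12}$", "^DE[0-9]{9}$"),
  stringsAsFactors = FALSE
)

# Each row is now checked against its own regex.
output <- within(dataset, {
  ValidRegex <- check_vat(REGEX_simple, VAT)
})
print(output)
```

In this sketch the two well-formed codes are flagged TRUE and the truncated DE code is flagged FALSE, regardless of which country comes first in the table.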

 

Please give me some time to research. Once there is a solution, I will post it as soon as possible.

 

 

Best Regards,

Icey

 

If this post helps, please consider accepting it as the solution so that other members can find it more quickly.
