Hi,
I have just started working with R scripts. I am trying to filter my data on a column using regex pattern matching in an R script.
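# 'dataset' holds the input data for this script
# Keep a character copy of the Subject column
dataset$StrDate <- as.character(dataset$Subject)
# Four letters, one whitespace character, then a dd.dd.dd date at the start of the subject
pattern <- "^[[:alpha:]]{4}\\s\\d{2}\\.\\d{2}\\.\\d{2}"
# TRUE where the value matches the pattern (case-insensitive)
isMatch <- function(x) grepl(pattern, as.character(x), ignore.case = TRUE)
dataset$Flag <- isMatch(dataset$StrDate)
output <- dataset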
My input table contains columns such as Subject, Attachments, Body, Sender, etc. Each cell in the Attachments column holds a table containing the PDF attachments. However, the output from the above code is a Name, Value table. After I expand the Value column, I get all the columns in text format, but I need the PDF attachments in the Attachments column. Can someone please help with this issue?
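To make the intended filter concrete, here is a minimal sketch of how the pattern from the script above behaves on a few subject lines. The sample subjects are hypothetical and only stand in for the real inbox data; the pattern and the `grepl` call are the same as in the script.

```r
# Hypothetical sample subjects (assumed for illustration, not real inbox data)
subjects <- c("Cash 01.02.20 statement",
              "RE: Cash 01.02.20 statement",
              "Fw: Cash 03.04.20",
              "cash 05.06.20")

# Same pattern as in the script: four letters, whitespace, then dd.dd.dd at the start
pattern <- "^[[:alpha:]]{4}\\s\\d{2}\\.\\d{2}\\.\\d{2}"

# Logical flag per subject; replies and forwards fail the "^" anchor
flag <- grepl(pattern, subjects, ignore.case = TRUE)

# Keep only the subjects that match
kept <- subjects[flag]
print(data.frame(Subject = subjects, Flag = flag))
```

Because the pattern is anchored at the start of the string, subjects prefixed with "RE: " or "Fw: " are not flagged, which is the filtering behaviour described in this thread.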
Hi @Anonymous,
What is your data source? Could you please share it with me together with your expected result, so that I can test it on my side?
Regards,
Frank
Thanks for your response!
I am extracting my inbox email data from Microsoft Exchange Online. I can't send the dataset due to company policy, but I am attaching a snapshot of all the columns.
Since I want all the emails in the inbox from the Cash team, I am trying to filter out all the RE: and Fw: emails using an R script. After I filter the emails, the R script returns the following table.
On expanding the Value column, I see that my input table's column values have been converted to strings. Because of this, I can no longer extract the PDF attachments from the tables in the Attachments column in the next steps.
I hope the problem and the expected solution are clear.