[R] Regular expression to define contents between parentheses
gunter.berton at gene.com
Tue Aug 25 22:41:46 CEST 2009
I believe that this is, indeed, tough; it might require Perl regexes to do
entirely within the regular expression language. You might also wish to
check out the gsubfn package to see if it could help.
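For reference, a minimal sketch of a regex-only route, assuming the parentheses are never nested and that dropping the blanks in front of each group is acceptable. It uses base R's gsub() with perl = TRUE and a lazy quantifier, so each match stops at the nearest ")" rather than the last one; the name subvector is borrowed from the original question.

```r
myvector <- c("something (80 km/h, sd) & more (6 kg/L,sd)",
              "somethingelse (48 m/s, sd) & moretoo (50g/L , sd)")

# "\\s*"     : any whitespace directly before the opening parenthesis
# "\\(.*?\\)": a "(" followed by as few characters as possible up to the next ")"
# Assumption: no nested parentheses; nested groups would leave a stray ")".
subvector <- gsub("\\s*\\(.*?\\)", "", myvector, perl = TRUE)
print(subvector)
```

A greedy ".*" in the same pattern would instead run from the first "(" to the last ")" in each string.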
However, a reasonably simple alternative approach that I think will work is
to use strsplit():
1. Split on "("
2. lapply on the resulting list of vectors and remove all elements from each
vector that contain a ")" using, e.g. grep().
3. sapply paste() on the now "cleaned" list to get back the cleaned-up vector.
I leave it to you to work out details -- or point out why I'm wrong.
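One possible way to work out those details is sketched below, with a deliberate change to step 2. Taken literally, dropping every piece that contains a ")" would also throw away the text that follows the ")" (for example "& more"), so this sketch instead strips each piece up to and including its first ")". It assumes the parentheses are balanced and not nested; the names pieces and cleaned are made up for illustration.

```r
myvector <- c("something (80 km/h, sd) & more (6 kg/L,sd)",
              "somethingelse (48 m/s, sd) & moretoo (50g/L , sd)")

# Step 1: split each string on a literal "("
pieces <- strsplit(myvector, "(", fixed = TRUE)

# Step 2 (modified): every piece after the first starts inside a
# parenthesised group, so delete everything up to and including the first ")"
cleaned <- lapply(pieces, function(p) {
  p[-1] <- sub("^[^)]*\\)", "", p[-1])
  p
})

# Step 3: paste the pieces back together, then tidy the leftover blanks
subvector <- sapply(cleaned, paste, collapse = "")
subvector <- trimws(gsub(" +", " ", subvector))
print(subvector)
```

The final gsub()/trimws() pass is only cosmetic: it collapses the double blanks left where a group was removed and drops the trailing one.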
Alternatively, wait for someone smarter to reply -- which I'm sure will
occur given the clarity with which you posed your problem.
Genentech Nonclinical Biostatistics
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Judith Flores
Sent: Tuesday, August 25, 2009 1:18 PM
Subject: [R] Regular expression to define contents between parentheses
Hello dear R-helpers,
I haven't been able to figure out or find a solution in the R-help
archives about how to delete all the characters contained in groups of
parentheses. I have a vector that looks more or less like this:
myvector <- c("something (80 km/h, sd) & more (6 kg/L,sd)",
              "somethingelse (48 m/s, sd) & moretoo (50g/L , sd)")
I want to extract all the strings that are not contained in parentheses; the
goal would be to obtain the following new vector:
subvector<-c("something & more", "somethingelse & moretoo")
I tried the following, but this pattern seems to enclose all that is
included between the first opened parenthesis and the last closed
parenthesis, which makes sense, but it's not what I need:
Your help will be greatly appreciated.