[R] Regular expression to find value between brackets

Gabor Grothendieck ggrothendieck at gmail.com
Wed Oct 13 22:14:23 CEST 2010


On Wed, Oct 13, 2010 at 2:16 PM, Bart Joosen <bartjoosen at hotmail.com> wrote:
>
> Hi,
>
> this should be an easy one, but I can't figure it out.
> I have a vector of tests, with their units between brackets (if they have
> units).
> eg tests <- c("pH", "Assay (%)", "Impurity A(%)", "content (mg/ml)")
>

strapply in gsubfn can match by content which is what you want.

We use a regular expression consisting of a literal left paren, "\\(",
followed by an opening capturing paren, (, followed by the longest
string not containing a right paren, [^)]*, followed by the closing
capturing paren, ), followed by a literal right paren, "\\)".  strapply
from the gsubfn package passes the captured portion of each match to
the function in the third argument, here c, which just concatenates
them.  simplify = c then flattens the result into a character vector
(rather than a list), so elements with no match, such as "pH",
contribute nothing to the output.

library(gsubfn)
tests <- c("pH", "Assay (%)", "Impurity A(%)", "content (mg/ml)")
strapply(tests, "\\(([^)]*)\\)", c, simplify = c)

e.g.

> strapply(tests, "\\(([^)]*)\\)", c, simplify = c)
[1] "%"     "%"     "mg/ml"
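
If you would rather not depend on gsubfn, here is a rough base R sketch
of the same idea.  It is only an illustration, not the approach used
above: the names units and units_aligned are made up for this example,
and the second variant assumes you want an NA for tests that have no
bracketed unit so that the result stays aligned with tests.

```r
tests <- c("pH", "Assay (%)", "Impurity A(%)", "content (mg/ml)")

# Same pattern as above: literal "(", captured run of non-")" chars, literal ")"
pat <- "\\(([^)]*)\\)"

# regmatches() with regexpr() returns the full match, brackets included,
# and only for the elements that actually matched
m <- regmatches(tests, regexpr(pat, tests))

# Strip the surrounding brackets to leave just the unit
units <- gsub("^\\(|\\)$", "", m)

# Variant aligned with the input (assumption: NA where there is no unit)
units_aligned <- ifelse(grepl(pat, tests),
                        sub(paste0(".*", pat, ".*"), "\\1", tests),
                        NA_character_)
```

units holds only the three matched units, like the strapply result,
while units_aligned has one entry per element of tests, which may be
handier if the units are to be stored next to the test names.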

See http://gsubfn.googlecode.com for the gsubfn home page and more info.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


