[R] toupper does not work in sub + regex

William Dunlap wdunlap at tibco.com
Mon Apr 13 19:16:43 CEST 2009

You could also use \\U and \\L in the replacement
with perl=TRUE.  \\U "converts the rest of the replacement
to upper case" and \\L converts to lowercase. (By
"replacement" it means the parts of the replacement
that arise from parenthesized subpatterns in the pattern
argument, not the replacement argument itself.)  E.g.,

> sub("q_([a-z])[a-zA-Z]*", "\\U\\1\\L", "q_sviRaw", perl=TRUE)
[1] "S"
> sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 then \\L\\2", "q_sviRaw", perl=TRUE)
[1] "S then viraw"
> sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 then \\2", "q_sviRaw", perl=TRUE)
[1] "S then VIRAW"
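
A further sketch along the same lines, using \\E, which with perl=TRUE ends the case conversion started by \\U or \\L. The input vector and the variable names below are made up for illustration; only sub, gsub and the perl=TRUE escapes come from base R.

```r
# Hypothetical inputs, chosen only to illustrate the escapes.
x <- c("q_sviRaw", "q_abcDef")

# Upper-case the first captured letter, then switch conversion off with \\E
# so the second group is copied through unchanged.
first_up <- sub("q_([a-z])([a-zA-Z]*)", "\\U\\1\\E\\2", x, perl = TRUE)
print(first_up)

# With gsub the replacement is applied to every match, so this capitalises
# the first letter of each word.
title_case <- gsub("\\b([a-z])", "\\U\\1", "the quick brown fox", perl = TRUE)
print(title_case)
```

Without \\E (or a following \\L), the upper-casing would carry on into every later backreference, as in the last example above.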

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 

[R] toupper does not work in sub + regex

Gabor Grothendieck ggrothendieck at gmail.com 
Mon Apr 13 18:26:12 CEST 2009

sub only handles replacement strings, not replacement functions.
Your code is the same as:

sub("q_([a-z])[a-zA-Z]*", '\\1', "q_sviRaw")

since '\\1' contains no letters, so toupper('\\1') is just literally '\\1',
and that is what sub uses.
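
A small sketch of that point, for anyone who wants to check it at the console. The variable names are only illustrative. The last line is one possible base R workaround, and it relies on the pattern matching the whole input string, so that the result of sub is just the captured letter.

```r
# R evaluates the argument before sub() ever sees it, so toupper() is
# applied to the two-character string \1, which has no letters to change.
repl <- toupper("\\1")
print(identical(repl, "\\1"))

# Hence the original call is an ordinary backreference replacement.
res_plain <- sub("q_([a-z])[a-zA-Z]*", repl, "q_sviRaw")
print(res_plain)

# Workaround sketch: extract the letter first, then upper-case the result.
# Assumes the pattern covers the entire string, as it does here.
res_upper <- toupper(sub("q_([a-z])[a-zA-Z]*", "\\1", "q_sviRaw"))
print(res_upper)
```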

The gsubfn function in the gsubfn package can deal with replacement
functions:

> library(gsubfn)
> gsubfn("q_([a-z])[a-zA-Z]*", toupper, "q_sviRaw")
[1] "S"

See the home page: http://gsubfn.googlecode.com, vignette and help page.

On Mon, Apr 13, 2009 at 11:54 AM, Tan, Richard <RTan at panagora.com> wrote:
> Hi, I don't know what I am doing wrong, but toupper does not seem to
> work in sub + regex.  The following returns 's', not the upper-case
> 'S' that I expect:
> sub("q_([a-z])[a-zA-Z]*",toupper('\\1'),"q_sviRaw")
> Can someone tell me where I did wrong?
> Thanks,
> Richard
