[R] toupper does not work in sub + regex
William Dunlap
wdunlap at tibco.com
Mon Apr 13 19:46:34 CEST 2009
> From: Tan, Richard [mailto:RTan at panagora.com]
> Sent: Monday, April 13, 2009 10:23 AM
> To: William Dunlap
> Cc: r-help at r-project.org
> Subject: RE: [R] toupper does not work in sub + regex
>
> Thanks, Bill! One more question, how do I get SviRaw, i.e., just
> uppercase the 1st char and keep everything else the same?
>
> sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 \\2", "q_sviRaw",perl=TRUE)
The easiest way to do that is to use Gabor's gsubfn package.
There doesn't appear to be a \\<letter> code that means don't convert
the rest of the replacement items to upper or lower case.
You could use other methods to avoid gsubfn. E.g.,
> sub("(q_)([a-z])", "\\U\\2", "q_sviRaw",perl=TRUE)
[1] "SviRaw"
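As a further sketch, not something from the thread itself: more recent
R documentation for sub lists an \\E code (perl=TRUE only) that ends
case conversion. Assuming your R version supports it, the second group
can then pass through unchanged. A base-R alternative that avoids the
case codes is shown as well; the helper name cap_first is made up for
illustration.

```r
x <- "q_sviRaw"

# \\U upper-cases what follows; \\E (assumed to be supported by your
# R version) ends the conversion, so group 2 is copied unchanged.
res_e <- sub("q_([a-z])([a-zA-Z]*)", "\\U\\1\\E\\2", x, perl = TRUE)

# Alternative without case codes: strip the "q_" prefix, then
# upper-case only the first character of what remains.
# cap_first is a hypothetical helper, not part of any package.
cap_first <- function(s) {
  s <- sub("^q_", "", s)
  paste0(toupper(substring(s, 1, 1)), substring(s, 2))
}
res_base <- cap_first(x)
```

Both res_e and res_base should hold "SviRaw" for this input.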
>
> Did not work.
>
> Thank you!
> Richard
>
> -----Original Message-----
> From: William Dunlap [mailto:wdunlap at tibco.com]
> Sent: Monday, April 13, 2009 1:17 PM
> To: Tan, Richard; r-help at r-project.org
> Subject: Re: [R] toupper does not work in sub + regex
>
> You could also use \\U and \\L in the replacement with perl=TRUE. \\U
> "converts the rest of the replacement to upper case" and \\L converts
> to lowercase. (By "replacement" it means the parts of the replacement
> that arise from parenthesized subpatterns in the pattern argument, not
> the replacement argument itself.) E.g.,
>
> > sub("q_([a-z])[a-zA-Z]*", "\\U\\1\\L", "q_sviRaw", perl=TRUE)
> [1] "S"
> > sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 then \\L\\2", "q_sviRaw", perl=TRUE)
> [1] "S then viraw"
> > sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 then \\2", "q_sviRaw", perl=TRUE)
> [1] "S then VIRAW"
>
> Bill Dunlap
> TIBCO Software Inc - Spotfire Division
> wdunlap tibco.com
>
> ----------------------------------------------------------------------
> [R] toupper does not work in sub + regex
>
> Gabor Grothendieck ggrothendieck at gmail.com Mon Apr 13 18:26:12 CEST 2009
>
> sub only handles replacement strings, not replacement functions.
> Your code is the same as:
>
> sub("q_([a-z])[a-zA-Z]*", '\\1', "q_sviRaw")
>
> since '\\1' contains no alphabetic characters: toupper('\\1') is just
> the literal '\\1', and that is what sub receives.
>
> The gsubfn function in the gsubfn package can deal with replacement
> functions:
>
> > library(gsubfn)
> > gsubfn("q_([a-z])[a-zA-Z]*", toupper, "q_sviRaw")
> [1] "S"
>
> See the home page (http://gsubfn.googlecode.com), the vignette and the
> help page.
>
> On Mon, Apr 13, 2009 at 11:54 AM, Tan, Richard <RTan at panagora.com> wrote:
> > Hi, I don't know what I am doing wrong, but toupper does not seem to
> > work in sub + regex. The following returns 's', not the uppercase
> > 'S' that I expect:
> >
> > sub("q_([a-z])[a-zA-Z]*",toupper('\\1'),"q_sviRaw")
> >
> > Can someone tell me what I did wrong?
> >
> > Thanks,
> > Richard
>
>