[R] toupper does not work in sub + regex

William Dunlap wdunlap at tibco.com
Mon Apr 13 19:46:34 CEST 2009


> From: Tan, Richard [mailto:RTan at panagora.com] 
> Sent: Monday, April 13, 2009 10:23 AM
> To: William Dunlap
> Cc: r-help at r-project.org
> Subject: RE: [R] toupper does not work in sub + regex
> 
> Thanks, Bill!  One more question, how do I get SviRaw, i.e., just
> uppercase the 1st char and keep everything else the same?  
> 
> sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 \\2", "q_sviRaw",perl=TRUE)

The easiest way to do that is to use Gabor's gsubfn package.
There doesn't appear to be a \\<letter> code that means don't convert
the rest of the replacement items to upper or lower case.

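You could use other methods to avoid gsubfn.  E.g., match only the
prefix and the first letter, so the rest of the string is never part
of the replacement and is left untouched: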
  > sub("(q_)([a-z])", "\\U\\2", "q_sviRaw",perl=TRUE)
  [1] "SviRaw"
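
Two further ways to get "SviRaw" without gsubfn, offered as a sketch
rather than a definitive answer.  The first assumes your version of R
supports "\\E" in the replacement (current ?sub documentation lists it,
for perl=TRUE only, as the code that ends case conversion); if yours
does not, the second uses only sub(), substr() and toupper().  The
variable names are illustrative only.

```r
x <- "q_sviRaw"

# Approach 1 (assumes "\\E" is supported by your R with perl = TRUE):
# \\U upper-cases group 1, \\E ends the conversion, so group 2 is
# copied into the result unchanged.
with_E <- sub("q_([a-z])([a-zA-Z]*)", "\\U\\1\\E\\2", x, perl = TRUE)
print(with_E)

# Approach 2 (no case codes at all): strip the "q_" prefix, then
# upper-case the first character in place with the substr<- form.
with_substr <- sub("^q_", "", x)
substr(with_substr, 1, 1) <- toupper(substr(with_substr, 1, 1))
print(with_substr)
```

Both should print "SviRaw" for this input.  The second approach
upper-cases whatever the first character is after the prefix, so it
does not check that it is a lowercase letter as the regex does.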
  

> 
> Did not work. 
> 
> Thank you!
> Richard
> 
> -----Original Message-----
> From: William Dunlap [mailto:wdunlap at tibco.com] 
> Sent: Monday, April 13, 2009 1:17 PM
> To: Tan, Richard; r-help at r-project.org
> Subject: Re: [R] toupper does not work in sub + regex
> 
> You could also use \\U and \\L in the replacement with perl=TRUE.  \\U
> "converts the rest of the replacement to upper case" and \\L converts
> to lowercase.  (By "replacement" it means the parts of the replacement
> that arise from parenthesized subpatterns in the pattern argument, not
> the replacement argument itself.)  E.g.,
> 
> > sub("q_([a-z])[a-zA-Z]*", "\\U\\1\\L", "q_sviRaw", perl=TRUE)
> [1] "S"
> > sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 then \\L\\2", "q_sviRaw", perl=TRUE)
> [1] "S then viraw"
> > sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 then \\2", "q_sviRaw", perl=TRUE)
> [1] "S then VIRAW"
> 
> Bill Dunlap
> TIBCO Software Inc - Spotfire Division
> wdunlap tibco.com 
> 
> ----------------------------------------------------------------------
> [R] toupper does not work in sub + regex
> 
> Gabor Grothendieck ggrothendieck at gmail.com
> Mon Apr 13 18:26:12 CEST 2009
> 
> sub only handles replacement strings, not replacement functions.
> Your code is the same as:
> 
> sub("q_([a-z])[a-zA-Z]*", '\\1', "q_sviRaw")
> 
> since '\\1' contains no alphabetic characters, so toupper('\\1') is
> just literally '\\1', and the latter is what sub uses.
> 
> The gsubfn function in the gsubfn package can deal with replacement
> functions:
> 
> > library(gsubfn)
> > gsubfn("q_([a-z])[a-zA-Z]*", toupper, "q_sviRaw")
> [1] "S"
> 
> See the home page: http://gsubfn.googlecode.com, vignette and help page.
> 
> On Mon, Apr 13, 2009 at 11:54 AM, Tan, Richard <RTan at panagora.com> wrote:
> > Hi, I don't know what I am doing wrong, but toupper does not seem to
> > work in sub + regex.  The following returns 's', not the uppercase
> > 'S' I expect:
> >
> > sub("q_([a-z])[a-zA-Z]*",toupper('\\1'),"q_sviRaw")
> >
> > Can someone tell me what I did wrong?
> >
> > Thanks,
> > Richard
> 
> 
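
To see why the original call returns 's', it may help to evaluate the
replacement argument on its own, since R evaluates toupper('\\1')
before sub() ever sees it.  A small sketch (the variable names are
illustrative only):

```r
# toupper() runs first, on the two-character string made of a backslash
# and the digit 1.  Neither is a letter, so it comes back unchanged.
replacement <- toupper("\\1")
print(identical(replacement, "\\1"))

# Hence both calls hand sub() exactly the same replacement string,
# and both simply return the first captured group as it was matched.
from_toupper <- sub("q_([a-z])[a-zA-Z]*", toupper("\\1"), "q_sviRaw")
from_literal <- sub("q_([a-z])[a-zA-Z]*", "\\1", "q_sviRaw")
print(identical(from_toupper, from_literal))
```

Both comparisons should print TRUE, and both sub() calls return "s".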



