[R] help with gsub and grep functions

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Oct 13 18:09:24 CEST 2003


On Mon, 13 Oct 2003, Simon Fear wrote:

> Well, this works for the first one:
> 
> > sub(" \\([A-Za-z0-9_ ]*\\)", "", Names)
> 
> and from there the second one is fairly obvious I hope.
> 
> QUESTION: having recently been using Source Edit I wanted
> to write [\\w]* instead of [A-Za-z0-9_ ]* but that doesn't

space is not in \w ... so try

> sub(" \\([\\w ]*\\)", "", Names, perl=TRUE)
[1] "g 604 be-0 -p1"
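
A minimal sketch of why the space has to be listed explicitly, assuming the Names vector from the original question (copied below so the snippet runs on its own). The names no_space and with_space are only illustrative.

```r
# Names as given in the original question
Names <- c("g 604 be-0 -p1 (602 matches)",
           "g 606 Phli-0 -p2 (517 matches)",
           "g 608 alu-0  (659 matches)")

# \w covers letters, digits and underscore only, so "602 matches"
# (which contains a space) cannot be matched and nothing is removed
no_space <- sub(" \\(\\w*\\)", "", Names, perl = TRUE)

# adding a space to the character class lets the whole
# parenthesized part match
with_space <- sub(" \\([\\w ]*\\)", "", Names, perl = TRUE)

print(identical(no_space, Names))
print(with_space)
```

Note that the third element has two spaces before the parenthesis, and the pattern consumes only one of them, so a trailing space is left on that element.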

However

> sub(".*\\((.*)\\)", "\\1", Names)

picks out the parenthesized part, and

> sub("\\((.*)\\)", "", Names)

omits it.
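
Both calls can be checked against the vector from the original question. This is only a sketch: Names is copied from the question, the result names are illustrative, and the last call (which also strips the spaces in front of the parenthesis) is an addition that is not part of the reply above.

```r
# Names as given in the original question
Names <- c("g 604 be-0 -p1 (602 matches)",
           "g 606 Phli-0 -p2 (517 matches)",
           "g 608 alu-0  (659 matches)")

# keep only what is inside the parentheses: the leading .* swallows
# everything up to the "(", and \\1 refers to the captured group
inside <- sub(".*\\((.*)\\)", "\\1", Names)

# drop the parenthesized part; the space(s) before "(" are left behind
outside <- sub("\\((.*)\\)", "", Names)

# assumption, not in the reply: also remove the spaces before "("
trimmed <- sub(" *\\(.*\\)", "", Names)

print(inside)
print(outside)
print(trimmed)
```

The double backslashes are needed because R first turns "\\(" in the string literal into \( before the regular expression engine sees it.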


> seem to work in R. ?grep points to
> ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/
> but I can't access that (server/gateway restriction). So,

Works here.

> could anyone tell me exactly what is allowed in R regular
> expressions? A URL to the POSIX standards would be useful
> too.
> 
> In fact it would be even more useful if R's particular choice
> of RE syntax, together with R's multiple backslashes, was given 
> somewhere in the R help itself ... yes I will write it if someone 
> gives me the info or points me in the right direction ...
> 
> 
> > -----Original Message-----
> > From: Martin Olivier [mailto:martinol at ensam.inra.fr]
> > Sent: 13 October 2003 15:31
> > To: r-help
> > Subject: [R] help with gsub and grep functions
> > 
> > 
> > Hi all,
> > 
> > Let Names be a vector of characters. For example,
> > 
> >  > Names
> > [1] "g 604 be-0 -p1 (602 matches)" "g 606 Phli-0 -p2 (517 matches)"
> > [3] "g 608 alu-0  (659 matches)"
> > 
> > I am trying to use the gsub or grep functions for two problems:
> > 
> > 1. First, I would like to delete all the characters between 
> > parentheses.
> > [1] "g 604 be-0 -p1" "g 606 be-0 -p2"
> > [3] "g 608 be-0 -p3"
> > 
> > 2. And, I would like to extract the characters between parentheses
> > [1] "602 matches" "517 matches"
> > [3] "659 matches"
> > 
> > 
> > 
> > Any idea?
> > 
> > Best regards,
> > Olivier
> > 
> > -- 
> > 
> > -------------------------------------------------------------
> > Martin Olivier
> > INRA - Unité protéomique           LIRMM - IFA/MAB
> > 2, Place Viala                     161, rue Ada
> > 34060 Montpellier Cédex 1          34392 Montpellier Cédex 5	
> > 
> > Tel : 04 99 61 27 01               Tel : 04 67 41 86 71
> > martinol at ensam.inra.fr             martin at lirmm.fr
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> >
>  
> 
> Simon Fear
> Senior Statistician
> Syne qua non Ltd
> Tel: +44 (0) 1379 644449
> Fax: +44 (0) 1379 644445
> email: Simon.Fear at synequanon.com
> web: http://www.synequanon.com

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



