[R] How to get values out of a string using regular expressions?

Gabor Grothendieck ggrothendieck at gmail.com
Fri May 28 14:25:42 CEST 2010


Try this:

as.numeric(gsub("\\D", "", X))
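
The one-liner above works by deleting every non-digit character, so it relies on each filename containing exactly one run of digits. If that assumption might not hold, a capture group with sub() is closer to the Perl $1 idiom mentioned in the question. The following is only a sketch: the vector Y is a hypothetical filename made up to show the difference, and the pattern assumes the number of interest is the first run of digits in the string.

```r
# First few filenames from the original question.
X <- c("OrthoP1_DNA_str.aln", "OrthoP10_DNA_str.aln",
       "OrthoP100_DNA_str.aln", "OrthoP101_DNA_str.aln")

# Approach from the reply: drop all non-digits, then convert to numeric.
ids_strip <- as.numeric(gsub("\\D", "", X))

# Capture-group alternative: match the whole string and keep only the
# first run of digits through the backreference \\1.
ids_capture <- as.numeric(sub("^\\D*(\\d+).*$", "\\1", X))

# Hypothetical filename with a second number, to show where the two differ.
Y <- "OrthoP12_DNA_v2.aln"
y_strip   <- gsub("\\D", "", Y)               # digits from both numbers are glued together
y_capture <- sub("^\\D*(\\d+).*$", "\\1", Y)  # only the first run of digits is kept
```

For the filenames in the question both approaches give the same vector; they diverge only when a name carries more than one number.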

On Fri, May 28, 2010 at 8:21 AM, Joris Meys <jorismeys at gmail.com> wrote:
> Dear all,
>
> I have a vector of filenames which begins like this :
> X <- c("OrthoP1_DNA_str.aln", "OrthoP10_DNA_str.aln",
> "OrthoP100_DNA_str.aln",
> "OrthoP101_DNA_str.aln", "OrthoP102_DNA_str.aln", "OrthoP103_DNA_str.aln",
> "OrthoP104_DNA_str.aln", "OrthoP105_DNA_str.aln", "OrthoP106_DNA_str.aln",
> "OrthoP107_DNA_str.aln")
>
> using
> grep("(\\d+)",X,perl=T,value=T)
>
> I get the complete values back. Yet, I want a vector :
>
> c(1,10,100,101,102,103,104,105,106,107)
>
> In Perl, using the brackets allows for extracting only the numbers (using a
> construct with $1 for those who know Perl).
>
> I want to do the same in R, but can't find a way of doing that without
> extensive string manipulations. The problem is that the lengths of the
> numbers differ, so I can't use substr.
> I tried
>> strsplit(X,"\\d+")
> [[1]]
> [1] "OrthoP"       "_DNA_str.aln"
> which gives me exactly what I want to throw away. So :
>> strsplit(X,"\\D+")
> [[1]]
> [1] ""  "1"
>
> [[2]]
> [1] ""   "10"
> gives something I can use, but it still requires a lot of list manipulation
> afterwards to get the right vector. Is there an option or a function I'm
> missing somewhere?
>
> Cheers
> Joris
>
> --
> Joris Meys
> Statistical Consultant
>
> Ghent University
> Faculty of Bioscience Engineering
> Department of Applied mathematics, biometrics and process control
>
> Coupure Links 653
> B-9000 Gent
>
> tel : +32 9 264 59 87
> Joris.Meys at Ugent.be


