[R] regular expressions : extracting numbers

Vladimir Eremeev wl2776 at gmail.com
Mon Jul 30 14:14:25 CEST 2007




GOUACHE David wrote:
> 
> Hello all,
> 
> I have a vector of character strings, in which I have letters, numbers,
> and symbols. What I wish to do is obtain a vector of the same length with
> just the numbers.
> A quick example -
> 
> extract of the original vector :
> "lema, rb 2%" "rb 2%" "rb 3%" "rb 4%" "rb 3%" "rb 2%,mineuse" "rb" "rb"
> "rb 12" "rb" "rj 30%" "rb" "rb" "rb 25%" "rb" "rb" "rb" "rj, rb"
> 
> and the type of thing I wish to end up with :
> "2" "2" "3" "4" "3" "2" "" "" "12" "" "30" "" "" "25" "" "" "" ""
> 
> or, instead of "", NA would be acceptable (actually it would almost be
> better for me)
> 

> chv<-scan(what="character",sep=" ") # then copy the text from your message to the clipboard and paste it into the R console
> chv
 [1] "lema, rb 2%"   "rb 2%"         "rb 3%"         "rb 4%"        
 [5] "rb 3%"         "rb 2%,mineuse" "rb"            "rb"           
 [9] "rb 12"         "rb"            "rj 30%"        "rb"           
[13] "rb"            "rb 25%"        "rb"            "rb"           
[17] "rb"            "rj, rb"       

# actual replacements :

# replace non-digits with nothing
> chv.digits<-gsub("[^0-9]","",chv)
> chv.digits
 [1] "2"  "2"  "3"  "4"  "3"  "2"  ""   ""   "12" ""   "30" ""   ""   "25" ""  
[16] ""   ""   ""  
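
Note that gsub() with "[^0-9]" keeps every digit in a string, so an element containing two numbers, e.g. "rb 2%, rj 30%", would come out as "230". Your sample has no such element, but if your full data might, here is a minimal sketch that keeps only the first run of digits and gives NA where there is none. The helper name first_number and the third test string are made up for illustration; regexpr() and regmatches() are base R.

```r
# Hypothetical helper (not from any package): first run of digits, or NA
first_number <- function(x) {
  m <- regexpr("[0-9]+", x)               # start of the first digit run; -1 if none
  hit <- !is.na(m) & m != -1              # which elements contain a digit
  out <- rep(NA_character_, length(x))    # NA where nothing matched
  out[hit] <- regmatches(x, m)            # matched substrings, in order
  out
}

x <- c("rb 2%", "rb", "rb 2%, rj 30%")    # last element is made up for illustration
gsub("[^0-9]", "", x)                     # digits of both numbers are glued together
first_number(x)                           # only the first number is kept
```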

# replace empty strings with NA
> chv.digits[chv.digits==""]<-NA
> chv.digits
 [1] "2"  "2"  "3"  "4"  "3"  "2"  NA   NA   "12" NA   "30" NA   NA   "25" NA  
[16] NA   NA   NA  
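
If you want the values as numbers rather than character strings, one possible shortcut (a sketch, assuming each string holds at most one number) is to pass the gsub() result to as.numeric(). An empty string converts to NA, so the explicit replacement step is then not needed. The short vector below is only a sample of yours.

```r
chv <- c("rb 2%", "rb", "rb 12")             # shortened sample of the vector above
nums <- as.numeric(gsub("[^0-9]", "", chv))  # "" becomes NA on conversion
nums
```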

 
-- 
View this message in context: http://www.nabble.com/regular-expressions-%3A-extracting-numbers-tf4169660.html#a11862597
Sent from the R help mailing list archive at Nabble.com.