[R] string manipulation

Gabor Grothendieck ggrothendieck at gmail.com
Fri Aug 26 14:57:09 CEST 2011


On Fri, Aug 26, 2011 at 7:27 AM, Jeff Newmiller
<jdnewmil at dcn.davis.ca.us> wrote:
> ".*" is greedy... might want regex "number[^0-9]*([0-9]{4})" to avoid
> getting 1999 from "I want the number 2000, not the number 1999."

If such inputs are possible, we could also do the following. A ? after
the * makes the repetition non-greedy, simplify = unlist flattens the
result into a vector, and the trailing [1] keeps only the first match,
since strapply would otherwise match and return all occurrences:

library(gsubfn)  # provides strapply()
strapply(mytext, "number.*?([0-9]{4})", as.numeric, simplify = unlist)[1] # 2000
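
For comparison, here is a rough base R sketch of the three patterns
discussed in this thread, run on the example sentence quoted above. It
is only an illustration under some assumptions: the value of mytext is
taken from that quoted sentence, first_capture is a hypothetical helper
name, and perl = TRUE in regexec() needs a reasonably recent R.

```r
# Assumed input: the example sentence quoted earlier in the thread.
mytext <- "I want the number 2000, not the number 1999."

# Hypothetical helper: return the first capture group of the first match
# as a number. regexec() reports the full match followed by each group,
# so element 2 is the parenthesised part.
first_capture <- function(pattern, x) {
  m <- regmatches(x, regexec(pattern, x, perl = TRUE))[[1]]
  as.numeric(m[2])
}

greedy     <- first_capture("number.*([0-9]{4})", mytext)       # 1999
non_greedy <- first_capture("number.*?([0-9]{4})", mytext)      # 2000
negated    <- first_capture("number[^0-9]*([0-9]{4})", mytext)  # 2000
```

The greedy ".*" runs to the end of the string and backtracks to the last
four-digit run, which is why it picks up 1999. The non-greedy ".*?" and
the negated class "[^0-9]*" both stop at the first four-digit run after
"number". Unlike strapply, this sketch only ever looks at the first
match, so no [1] is needed.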

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
