[R] Regexp subexpression
Dieter Menne
dieter.menne at menne-biomed.de
Sat Mar 25 17:22:52 CET 2006
I can't work out how to translate Perl-style subexpression capture to R.
Following, for example, B. Ripley's post at
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/58984.html
I am using sub(), but it feels like an ugly workaround. Suppose I want to
extract the first alphabetic part and the first numeric part of each string,
but only if they appear in that order.
Do I really have to call sub() twice, once to extract the first part and
once to extract the second? The third example should return nothing, because
its parts are in the reverse order, but sub() returns the whole string
instead. I know I could check for that separately, but is there no better way?
patid <- c("ALAN334", "AzD44", "44AZD")
txt <- sub("([[:alpha:]]+)([[:digit:]]+)", "\\1", patid)
num <- sub("([[:alpha:]]+)([[:digit:]]+)", "\\2", patid)
It would be nice if the following data frame would be returned:
txt   num
ALAN  334
AzD   44
NA    NA    (or "" and "", but that is not as nice)
Dieter
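
One possible approach, offered as a sketch rather than a definitive answer: in current versions of R, regexec() together with regmatches() returns all capture groups from a single match, and yields an empty result for strings that do not match, which can then be mapped to NA. The names `pattern`, `pick`, and `result` are made up for this illustration, and the "^" anchor is an assumption that the alphabetic part must start the string.

```r
patid <- c("ALAN334", "AzD44", "44AZD")

# Assumption: letters must come first, so the pattern is anchored at the start.
pattern <- "^([[:alpha:]]+)([[:digit:]]+)"

# regexec() records the positions of the full match and of each capture group;
# regmatches() extracts them, giving character(0) where nothing matched.
m <- regmatches(patid, regexec(pattern, patid))

# Hypothetical helper: return capture group i, or NA if the string did not match.
pick <- function(x, i) if (length(x) > i) x[i + 1] else NA_character_

result <- data.frame(
  txt = vapply(m, pick, character(1), i = 1),
  num = vapply(m, pick, character(1), i = 2),
  stringsAsFactors = FALSE
)
print(result)
```

Here the pattern is matched once per string, and the non-matching "44AZD" produces NA in both columns rather than the unchanged input.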