[R] regular expression question
Dirk Eddelbuettel
edd at debian.org
Sun Jun 11 21:47:47 CEST 2006
On 11 June 2006 at 14:25, markleeds at verizon.net wrote:
| i have variables that are of type character but
| they have number characters at the end. for example :
|
| "AAL123"
| "XELB245"
| "A247"
|
| I want a command that gives me just the letter characters
| for each one.
| the letter characters always start first and then the number characters come second and it never flips back to letter characters
| once the number characters start. i am using R-2.20 on
| windows Xp. Thanks. substring doesn't work because the
| length of the letter characters can vary.
> gsub("(\\d*)$","",c("AAL123", "XELB245", "A247", "FOO123BAR"), perl=TRUE)
[1] "AAL" "XELB" "A" "FOO123BAR"
>
gsub finds what is described by its first argument, a regexp [ here (\\d*)$,
i.e. any sequence of digits before the end-of-line ], and replaces each match
in the third argument with the second argument, which is a replacement string
rather than a regexp [ here an empty string as we simply delete ].
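As a small sketch of the three argument roles described above, the same call can be written with named arguments. The second call is an illustrative variation that is not part of the original answer: it uses the parenthesised group as a backreference (\\1) in the replacement, to keep the digits instead of the letters. The variable names are made up for the example.

```r
x <- c("AAL123", "XELB245", "A247")

# Same call as above with the arguments named: pattern, replacement, input.
stripped <- gsub(pattern = "(\\d*)$", replacement = "", x = x, perl = TRUE)
print(stripped)

# Illustrative variation: the replacement refers to the captured group,
# so the leading capital letters are dropped and the trailing digits kept.
digits <- sub("^[A-Z]+(\\d*)$", "\\1", x, perl = TRUE)
print(digits)
```

In the first call the parentheses are optional, since nothing refers back to the group; they only matter once the replacement uses \\1.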
Note
- how the $ symbol prevents it from eating the non-final digits
in the counter example FOO123BAR
- how the \d for digits needs escaped backslashes \\d
- how the * char denotes '0 or more of the preceding thingie' (+ would
denote '1 or more')
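The points in the list above can be checked with a short sketch. It assumes the same test vector as before; the POSIX class [[:digit:]] is swapped in here as an alternative to \\d that does not need perl=TRUE, and sub is used instead of gsub because the anchored pattern can match only once per string.

```r
x <- c("AAL123", "XELB245", "A247", "FOO123BAR")

# '+' requires at least one trailing digit, so a string without any
# is left untouched rather than matched with an empty sequence.
letters_only <- sub("[[:digit:]]+$", "", x)
print(letters_only)

# '*' also accepts zero digits, so the pattern matches every input ...
star_hits <- grepl("\\d*$", x, perl = TRUE)
# ... whereas '+' matches only the strings that really end in digits.
plus_hits <- grepl("\\d+$", x, perl = TRUE)
print(star_hits)
print(plus_hits)
```

Both quantifiers give the same result for the deletion itself; the difference only shows when testing whether a match exists.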
Hth, Dirk
--
Hell, there are no rules here - we're trying to accomplish something.
-- Thomas A. Edison