[R] grep searching for sequence of 3 consecutive upper case letters
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Mon Nov 6 23:51:35 CET 2006
"Lapointe, Pierre" <Pierre.Lapointe at nbf.ca> writes:
> Hello,
>
> I need to identify all elements which have a sequence of 3 consecutive upper
> case letters, anywhere in the string.
>
> I tested my grep expression on this site: http://regexlib.com/RETester.aspx
>
> But when I try it in R, it does not filter anything.
>
> str <-c("AGH", "this WOUld be good", "Not Good at All")
> str[grep('[A-Z]{3}',str)] #looking for a sequence of 3 consecutive upper case letters
>
> [1] "AGH" "this WOUld be good" "Not Good at All"
>
> Any idea?
There are multiple flavours of regular expressions, and the fine
details are resolved differently in each. Don't expect the RETester to
hold the Final Truth; it seems to be tied to a particular programming
environment, which is not R.
> grep('[A-Z]{3}', str, perl=TRUE)
[1] 1 2
Not only that, but
> grep('[ABCDEFGHIJKLMNOPQRSTUVWXYZ]{3}', str)
[1] 1 2
Hint: What is your collating sequence?
> Sys.setlocale("LC_COLLATE", "C")
[1] "C"
> grep('[A-Z]{3}', str)
[1] 1 2
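To spell out the hint: in many non-C locales the collating sequence interleaves the cases (roughly aAbBcC...), so the range A-Z can take in lower case letters as well. The following is only a sketch, not part of the original session. It assumes the same `str` vector as above; the names `old_collate`, `c_order` and `idx` are made up for illustration. It shows the ordering in the C locale and then uses the POSIX class `[[:upper:]]`, which sidesteps the range question without changing the locale.

```r
str <- c("AGH", "this WOUld be good", "Not Good at All")

# Remember the current collation setting so it can be restored afterwards.
old_collate <- Sys.getlocale("LC_COLLATE")

# In the C locale, characters sort by character code, so every upper case
# ASCII letter comes before every lower case one.
invisible(Sys.setlocale("LC_COLLATE", "C"))
c_order <- sort(c("b", "A", "a", "B"))
print(c_order)

# Put the collation setting back the way it was.
invisible(Sys.setlocale("LC_COLLATE", old_collate))

# A character class instead of a range: no dependence on collating order.
idx <- grep("[[:upper:]]{3}", str)
print(str[idx])
```

Running the same `sort()` call before switching to the C locale should show what your own collating sequence looks like.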
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907