[R] Using grep

Wed Oct 8 18:43:06 CEST 2008

On 08-Oct-08 15:19:02, mentor_ wrote:
> Hi,
> I have a vector A with (200, 201, 202, 203, 204, ... 210) and
> a vector B with (201, 204, 209).
> Now I would like to get the position in vector A matches with
> the entries in vector B
> So what I want to have is the following result:
> [1] 2 5 10

First of all:

  A <- 200:210
  B <- c(201, 204, 209)
  A
# [1] 200 201 202 203 204 205 206 207 208 209 210
  B
# [1] 201 204 209
  which(A %in% B)
# [1]  2  5 10

as desired.
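If the positions are wanted in the order of the entries of B, rather than in the order in which they occur in A, match() is a possible alternative. The following is only a small sketch using the same A and B as above; the name 'pos' and the value 999 are made up for illustration.

```r
A <- 200:210
B <- c(201, 204, 209)

# match() returns, for each element of B, the position of its first match in A
pos <- match(B, A)
pos
# [1]  2  5 10

# A value that does not occur in A yields NA instead of being dropped
match(c(204, 999, 201), A)
# [1]  5 NA  2
```

With which(A %in% B), entries of B that are absent from A simply do not appear in the result, whereas match() marks them with NA.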

> I tried the following:
> grep(B, A)
> 
> grep(c(B), A)
> 
> A <- as.character(A)
> B <- as.character(B)
> 
> grep(B, A)
> grep(c(B), A)
> 
> and several other combinations. But nothing is giving me the right
> result?!
> Does anyone know why?

In grep(pattern,x,...):
pattern: character string containing a regular expression
         (or character string for 'fixed = TRUE') to be matched
         in the given character vector. Coerced by 'as.character'
         to a character string if possible.

 x, text: a character vector where matches are sought, or an
          object which can be coerced by 'as.character' to a
          character vector.

You can clearly have 'x' as a vector of character strings, so
your as.character(A) is valid for 'x'.

But 'pattern' must be a single character string (a regular
expression, or a literal string with 'fixed = TRUE'), and
as.character(B) is a vector of three character strings.
So it is not valid for 'pattern'.
What happens here is that grep() takes only the first
element ("201") of as.character(B) and uses this as the
regular expression (recent versions of R also warn that the
remaining elements are ignored):

  grep("A",c("ABC","BCD","CDE","EAB"))
# [1] 1 4
# (as expected)

  grep(c("A","B"),c("ABC","BCD","CDE","EAB"))
# [1] 1 4
# (the same as the previous; "B" in c("A","B") is ignored)
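
If grep() is to be used all the same, one possibility is to combine the entries of B into a single regular expression. This is only a sketch with the original A and B; the names 'first_only', 'pattern' and 'hits' are made up for illustration, and the anchors "^" and "$" are there on the assumption that whole entries should match, not substrings.

```r
A <- 200:210
B <- c(201, 204, 209)

# What grep(B, A) effectively does: only the first element, "201", is used
first_only <- grep(as.character(B)[1], A)
first_only
# [1] 2

# Join the entries into one anchored alternation: "^(201|204|209)$"
pattern <- paste0("^(", paste(B, collapse = "|"), ")$")
hits <- grep(pattern, A)
hits
# [1]  2  5 10
```

For exact matching of numbers, which(A %in% B) is simpler and avoids the conversion to character strings; the regular expression route is mainly of interest when partial matches are wanted.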

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 08-Oct-08                                       Time: 17:43:03
------------------------------ XFMail ------------------------------