[Rd] grep() and factors

Bill Dunlap bill at insightful.com
Mon Jun 5 22:45:03 CEST 2006


On Mon, 5 Jun 2006, Marc Schwartz (via MN) wrote:

> Based upon an offlist communication this morning, I am somewhat confused
> (more than I usually am on most Monday mornings...) about the use of
> grep() with factors as the 'x' argument.
>  ...
> > grep("[a-z]", letters)
>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
> [23] 23 24 25 26
>
> > grep("[a-z]", factor(letters))
> numeric(0)

This surprised me recently as well.  Also, if R's grep() did
support factors in this way, what sort of object (factor or
character vector) should it return when value=T?
I recently changed Splus's grep to return a character vector in
that case:

   Splus> grep("[def]", letters[26:1])
   [1] 21 22 23
   Splus> grep("[def]", factor(letters[26:1], levels=letters[26:1]))
   [1] 21 22 23
   Splus> grep("[def]", letters[26:1], value=T)
   [1] "f" "e" "d"
   Splus> grep("[def]", factor(letters[26:1], levels=letters[26:1]), value=T)
   [1] "f" "e" "d"
   Splus> class(.Last.value)
   [1] "character"

R already does the same thing (returns a character vector) when
grepping an integer vector:
   R> grep("1", 0:11, value=T)
   [1] "1"  "10" "11"
help(grep) says it returns "the matching elements themselves", but
doesn't say if "themselves" means before or after the conversion to
character.
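
Until that is settled, one way to sidestep the ambiguity is to do the
conversion yourself and use the indices to subset the original object.
The following is only a sketch of that idea, not a statement of how
grep() is or should be implemented; the variable names are made up for
illustration.

```r
# Sketch: coerce explicitly so the result does not depend on how
# grep() happens to treat factors.
f <- factor(letters[26:1], levels = letters[26:1])
x <- as.character(f)

idx  <- grep("[def]", x)                # positions of the matches
vals <- grep("[def]", x, value = TRUE)  # matching labels, as character
sub  <- f[idx]                          # matching elements, still a factor

# Same idea for an integer vector: value = TRUE gives the converted
# strings, while indexing gives back the original integers.
n     <- 0:11
n_chr <- grep("1", n, value = TRUE)
n_int <- n[grep("1", n)]
```

Subsetting by index keeps the class of the original object, so the
caller can choose between "before" and "after" conversion explicitly.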

----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com
360-428-8146

 "All statements in this message represent the opinions of the author and do
 not necessarily reflect Insightful Corporation policy or position."


