[Rd] grep() and factors
Bill Dunlap
bill at insightful.com
Tue Jun 6 02:57:59 CEST 2006
On Mon, 5 Jun 2006, Marc Schwartz (via MN) wrote:
> > > > grep("[a-z]", factor(letters))
> > > numeric(0)
> >
> > I was recently surprised by this also. In addition, if
> > R's grep did support factors in this way, what sort of
> > object (factor or character) should it return when value=T?
> > I recently changed Splus's grep to return a character vector in
> > that case.
> >
> > Splus> grep("[def]", letters[26:1])
> > [1] 21 22 23
> > Splus> grep("[def]", factor(letters[26:1], levels=letters[26:1]))
> > [1] 21 22 23
> > Splus> grep("[def]", letters[26:1], value=T)
> > [1] "f" "e" "d"
> > Splus> grep("[def]", factor(letters[26:1], levels=letters[26:1]), value=T)
> > [1] "f" "e" "d"
> > Splus> class(.Last.value)
> > [1] "character"
> >
> > R does this when grepping an integer vector.
> > R> grep("1", 0:11, value=T)
> > [1] "1" "10" "11"
> > help(grep) says it returns "the matching elements themselves", but
> > doesn't say if "themselves" means before or after the conversion to
> > character.
>
> Bill,
>
> My first inclination for the return value when used on a factor would be
> the indexed factor elements where grep() would otherwise simply return
> the indices. This would also maintain the factor levels from the
> original source factor since "[".factor would normally retain these when
> drop = FALSE.
That would be my first inclination also. I would have expected the output of
   grep(pattern, text, value=TRUE)
to be identical to that of
   text[grep(pattern, text, value=FALSE)]
no matter what class text has.
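
A rough sketch of that expectation, written as a small wrapper. The name
grep_keep_class is hypothetical (it is not a function in R or Splus), and the
explicit as.character() call is an assumption made so that the matching step
does not depend on how a particular version of grep() treats factors:

```r
# Hypothetical helper, not part of R or S-PLUS: match against the
# character form of x, then index the ORIGINAL object, so the result
# keeps the class (and, for factors, the levels) of the input.
grep_keep_class <- function(pattern, x) {
  idx <- grep(pattern, as.character(x))  # integer indices of matching elements
  x[idx]                                 # "[" dispatches on the class of x
}

f <- factor(letters[26:1], levels = letters[26:1])

res_factor <- grep_keep_class("[def]", f)   # a factor, all 26 levels retained
res_int    <- grep_keep_class("1", 0:11)    # an integer vector, not character
```

With this definition a factor input gives back a factor and an integer input
gives back integers, which is the "identical to text[grep(...)]" behavior
described above.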
No end users have seen this change in Splus yet, so we can still change it
to anything, but we want its behavior to stay the same as R's.
> I could be convinced either way. The concern of course being that (given
> the offlist replies I have received today) even experienced users are
> getting bitten by the current behavior versus their intuitive
> expectations, which are at least loosely supported by the documentation.
>
> HTH,
>
> Marc Schwartz
----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com
360-428-8146
"All statements in this message represent the opinions of the author and do
not necessarily reflect Insightful Corporation policy or position."