[R] Is this an artifact of using "which"?
Richard.Cotton at hsl.gov.uk
Mon Apr 14 13:50:54 CEST 2008
> I used "which" to obtain a subset of values from my data.frame.
> However, I find that there is a "trace" of the values I have removed.
> Any suggestions would be greatly appreciated.
>
> Below is my data:
>
> d <- data.frame( val = 1:10,
>                  group = sample(LETTERS[1:5], 10, replace = TRUE) )
>
> > d
>    val group
> 1    1     B
> 2    2     E
> 3    3     B
> 4    4     C
> 5    5     A
> 6    6     B
> 7    7     A
> 8    8     E
> 9    9     E
> 10  10     A
>
> ## selecting everything that is not group "A"
> d<-d[which(d$group !="A"),]
>
> > d
>   val group
> 1   1     B
> 2   2     E
> 3   3     B
> 4   4     C
> 6   6     B
> 8   8     E
> 9   9     E
>
> > levels(d$group)
> [1] "A" "B" "C" "E"
The (imho) unintuitive behaviour comes from the factor subsetting method,
[.factor, not from which. Subsetting a factor keeps all of its original
levels, including any that no longer occur in the data. There are a couple
of workarounds:
1. Call factor again to rebuild the levels, which gets rid of "A":
d$group <- factor(d$group)
2. Redefine [.factor; see dropUnusedLevels in the Hmisc package.
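For anyone wanting to reproduce this, here is a minimal sketch of the first workaround. It uses fixed data copied from the printed data frame rather than sample(), and builds the column with an explicit factor() call on the assumption that data.frame may not convert strings to factors by default (R 4.0.0 and later). The names kept, fixed and fixed2 are only illustrative. The last step uses droplevels(), which assumes R 2.12.0 or later and so was not an option when this thread was written.

```r
# Fixed copy of the poster's data, so the result does not depend on sample().
d <- data.frame(
  val   = 1:10,
  group = factor(c("B", "E", "B", "C", "A", "B", "A", "E", "E", "A"))
)

# Same subsetting as in the question: drop every row in group "A".
kept <- d[which(d$group != "A"), ]

# The rows are gone, but the level set is unchanged:
levels(kept$group)   # "A" is still listed
table(kept$group)    # "A" shows up with a count of zero

# Workaround 1: rebuild the factor so only levels still in use remain.
fixed <- kept
fixed$group <- factor(fixed$group)
levels(fixed$group)

# Assumption: R >= 2.12.0. droplevels() does the same for every
# factor column of a data frame in one call.
fixed2 <- droplevels(kept)
levels(fixed2$group)
```

Replacing which(d$group != "A") with the plain logical index d$group != "A" gives the same leftover level here, which is consistent with the cause being [.factor rather than which.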
Regards,
Richie.
Mathematical Sciences Unit
HSL