[R] Selecting groups with R
David Winsemius
dwinsemius at comcast.net
Sat Aug 22 00:33:48 CEST 2009
On Aug 21, 2009, at 6:16 PM, Don McKenzie wrote:
> dataset[dataset$Color != "BLUE",]
That will return a data.frame in which Color is still a factor with three levels.
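
A minimal sketch of why this happens, assuming the data look like the nine rows quoted below (the `dataset` built here is only an illustrative reconstruction, and the level order is chosen to match the table() output in the original post). Subsetting removes rows but leaves the factor's level set alone; re-applying factor() is one common way to rebuild the levels from the values that remain.

```r
# Illustrative reconstruction of the example data from the original post.
dataset <- data.frame(
  Color = factor(rep(c("RED", "WHITE", "BLUE"), each = 3),
                 levels = c("RED", "WHITE", "BLUE")),
  Score = c(10, 13, 12, 22, 27, 25, 18, 17, 16)
)

# Subsetting drops the BLUE rows, but the BLUE level is retained.
myComp1 <- dataset[dataset$Color != "BLUE", ]
levels_before <- levels(myComp1$Color)
levels_before            # "RED" "WHITE" "BLUE"

# Re-applying factor() keeps only the levels that actually occur.
myComp1$Color <- factor(myComp1$Color)
levels(myComp1$Color)    # "RED" "WHITE"
table(myComp1$Color)     # no BLUE column any more
```

The same call works on a single column of any data.frame, so it can be applied right after whichever subsetting form is used.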
>
> On 21-Aug-09, at 3:08 PM, jlwoodard wrote:
>
>>
>> I have a data set similar to the following:
>>
>> Color Score
>> RED 10
>> RED 13
>> RED 12
>> WHITE 22
>> WHITE 27
>> WHITE 25
>> BLUE 18
>> BLUE 17
>> BLUE 16
>>
>> and I am trying to select just the values of Color that are
>> equal to RED
>> or WHITE, excluding the BLUE.
>>
>> I've tried the following:
>> myComp1<-subset(dataset, Color =="RED" | Color == "WHITE")
>> myComp1<-subset(dataset, Color != "BLUE")
>> myComp1<-dataset[which(dataset$Color != "BLUE"),]
>>
>> Each of the above lines successfully excludes the BLUE subjects,
>> but the
>> "BLUE" category is still present in my data set; that is, if I try
>> table(Color) I get
>>
>> RED WHITE BLUE
>> 82 151 0
>>
>> If I try to do a t-test (since I've presumably gone from three
>> groups to two
>> groups), I get:
>> Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx),
>> abs(my)))
>> stop("data are essentially constant") :
>> missing value where TRUE/FALSE needed
>> In addition: Warning message:
>> In mean.default(y) : argument is not numeric or logical: returning NA
>>
>> and describe.by(score,Color) gives me descriptives for RED and
>> WHITE, and
>> BLUE also shows up as NULL.
>>
>> How can I eliminate the BLUE category completely so I can do a t-test
>> using Color (with just the RED and WHITE subjects)?
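
One possible way to get to the t-test, sketched under assumptions: the exact t.test() call is not shown in the post, but the warning from mean.default(y) suggests the factor itself may have been passed as the second sample (something like t.test(Score, Color)), which would fail regardless of the unused level. The sketch below rebuilds the example data (illustrative only), drops the unused level with factor(), and uses the formula interface so that Color acts as the grouping variable.

```r
# Illustrative reconstruction of the example data from the original post.
dataset <- data.frame(
  Color = factor(rep(c("RED", "WHITE", "BLUE"), each = 3),
                 levels = c("RED", "WHITE", "BLUE")),
  Score = c(10, 13, 12, 22, 27, 25, 18, 17, 16)
)

# Keep only RED and WHITE, then rebuild the factor to drop the BLUE level.
myComp1 <- subset(dataset, Color != "BLUE")
myComp1$Color <- factor(myComp1$Color)

# Formula interface: numeric response on the left, two-level factor on the right.
result <- t.test(Score ~ Color, data = myComp1)
result
```

With the real data, replace the reconstructed `dataset` with the actual data.frame; the column names Color and Score are taken from the example rows.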
David Winsemius, MD
Heritage Laboratories
West Hartford, CT