[R] Inexplicably different results using subset vs bracket notation on logical variable
William Dunlap
wdunlap at tibco.com
Tue Aug 28 05:02:35 CEST 2012
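subset(dataFrame, subset) does the equivalent of dataFrame[!is.na(subset) & subset,].
I.e., it treats the NAs in the subset argument the same as FALSEs, whereas bracket
indexing returns a row of NAs for each NA in a logical index, so you get back as many
rows as the data frame has. Doesn't help(subset) mention this?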
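
A minimal sketch of the difference, assuming a small made-up data frame (the
name 'toy' and its columns 'id' and 'amount' are hypothetical, not from the
original data) in which Renewal is either TRUE or NA:

```r
# Hypothetical toy data: Renewal is TRUE or NA, never FALSE
toy <- data.frame(id      = 1:6,
                  amount  = c(10, 20, 30, 40, 50, 60),
                  Renewal = c(TRUE, NA, TRUE, NA, NA, TRUE))

# subset() treats NA in the condition as FALSE, so only TRUE rows survive
by_subset <- subset(toy, Renewal == TRUE, 1:2)

# Bracket indexing keeps one all-NA row for every NA in the logical index
by_bracket <- toy[toy$Renewal == TRUE, 1:2]

# Handling NA explicitly in the index mimics what subset() does
by_bracket_fixed <- toy[!is.na(toy$Renewal) & toy$Renewal, 1:2]

nrow(by_subset)         # rows where Renewal is TRUE
nrow(by_bracket)        # same as nrow(toy): TRUE rows plus all-NA rows
nrow(by_bracket_fixed)  # matches nrow(by_subset)
```

With data like this, the bracket version should return as many rows as the
data frame has, which matches the symptom described in the original question.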
By the way, if Renewal is a logical vector, it will be identical to Renewal==TRUE so
you may as well leave off the "==TRUE".
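
A quick check of that claim, as a sketch on a made-up logical vector (the
values here are illustrative only):

```r
# Illustrative logical vector containing TRUE, FALSE and NA
Renewal <- c(TRUE, NA, FALSE, TRUE)

# Comparing a logical vector to TRUE returns the same values, NA included
same_values <- identical(Renewal, Renewal == TRUE)
same_values

# The NA positions are unchanged, so "==TRUE" does nothing to guard against NA
na_positions_match <- identical(is.na(Renewal), is.na(Renewal == TRUE))
na_positions_match
```

So dropping the "==TRUE" changes nothing; the NAs still have to be handled
explicitly when using bracket notation.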
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Mauricio Cornejo
> Sent: Monday, August 27, 2012 3:09 PM
> To: r-help at r-project.org
> Subject: [R] Inexplicably different results using subset vs bracket notation on logical
> variable
>
> Hi,
>
> Would anyone have any idea as to why I would obtain completely different results when
> subsetting using the subset function vs bracket notation?
>
> I have a data frame with 65 variables and 4382 rows. When I execute the following
> subset command I get the correct results (125 rows):
> > subset(df, Renewal==TRUE, 1:2)
>
>
> However, I tried to obtain the same results with bracket notation as follows. The output
> gave me all the rows in the data frame and not just the subset of 125 I was looking for.
> > df[df$Renewal==TRUE, 1:2]
>
> The 'Renewal' variable is of logical type and is the last (65th) variable in the data
> frame. However, values are either TRUE or NA (there are no 'FALSE' values).
>
> My attempts at replicating this with a small dummy data set, for including here, have not
> worked (i.e. I don't get an error when I use synthetic data). Any ideas on what could be
> going on?
>
> Many thanks for any insights anyone may have,
> Mauricio