[R] Inexplicably different results using subset vs bracket notation on logical variable

William Dunlap wdunlap at tibco.com
Tue Aug 28 05:02:35 CEST 2012


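subset(dataFrame, subset) does the equivalent of dataFrame[!is.na(subset) & subset,].
I.e., it treats the NAs in the subset argument the same as FALSEs.  Bracket indexing,
by contrast, returns a row of NAs for each NA in the logical index, which is why you
got a row back for every row of the data frame.  Doesn't help(subset) mention this?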
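
A small sketch of the difference, using made-up data (the data frame and its
column names 'id' and 'name' are hypothetical, chosen only to mimic a Renewal
column that is TRUE or NA and never FALSE):

```r
# Hypothetical toy data: Renewal is TRUE or NA, never FALSE.
df <- data.frame(id = 1:6,
                 name = letters[1:6],
                 Renewal = c(TRUE, NA, NA, TRUE, NA, TRUE))

# subset() treats NA in the condition as FALSE, so those rows are dropped.
by_subset <- subset(df, Renewal == TRUE, 1:2)

# Bracket indexing returns one all-NA row for each NA in the logical index.
by_bracket <- df[df$Renewal == TRUE, 1:2]

# Spelling out what subset() does gives the same rows as subset().
by_explicit <- df[!is.na(df$Renewal) & df$Renewal, 1:2]

nrow(by_subset)    # 3
nrow(by_bracket)   # 6 (3 real rows plus 3 rows of NAs)
nrow(by_explicit)  # 3
```

With data like this the bracket version appears to return "all the rows", but
the extra rows contain only NAs.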

By the way, if Renewal is a logical vector, it will be identical to Renewal==TRUE so
you may as well leave off the "==TRUE".
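
A quick check of that claim on a made-up vector. The which() line is not from
the original question; it is just one other common way to drop the NAs when
indexing, shown here as an illustration:

```r
# Hypothetical logical vector containing TRUE, FALSE and NA.
Renewal <- c(TRUE, NA, FALSE, TRUE)

# Comparing a logical vector with TRUE changes nothing, NAs included.
identical(Renewal, Renewal == TRUE)  # TRUE

# which() returns only the positions that are TRUE, skipping NA and FALSE.
true_positions <- which(Renewal)
true_positions                       # 1 4
```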

Bill Dunlap
Spotfire, TIBCO Software
wdunlap at tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Mauricio Cornejo
> Sent: Monday, August 27, 2012 3:09 PM
> To: r-help at r-project.org
> Subject: [R] Inexplicably different results using subset vs bracket notation on logical
> variable
> 
> Hi,
> 
> Would anyone have any idea as to why I would obtain completely different results when
> subsetting using the subset function vs bracket notation?
> 
> I have a data frame with 65 variables and 4382 rows. When I execute the following
> subset command I get the correct results (125 rows):
> > subset(df, Renewal==TRUE, 1:2)
> 
> 
> However, I tried to obtain the same results with bracket notation as follows.  The output
> gave me all the rows in the data frame and not just the subset of 125 I was looking for.
> > df[df$Renewal==TRUE, 1:2]
> 
> The 'Renewal' variable is of logical type and is the last (65th) variable in the data
> frame.  However, values are either TRUE or NA (there are no 'FALSE' values).
> 
> My attempts at replicating this with a small dummy data set, for including here, have not
> worked (i.e. I don't get an error when I use synthetic data).  Any ideas on what could be
> going on?
> 
> Many thanks for any insights anyone may have,
> Mauricio
> 
