[R] length() misbehaving?

Uwe Ligges ligges at statistik.uni-dortmund.de
Fri Mar 14 16:59:08 CET 2003

David Parkhurst wrote:
> I'm having a weird problem with length(), in R 1.6.1 under Windows 2000.  I have a
> dataframe called byyr, with ten columns, the first of which is named cnd95.
> summary(byyr) shows that byyr$cnd95 contains the factor level "tr" 66 times.  Also,
> when I enter byyr$cnd95 at the command line, I can count 66 "tr" elements in the
> resulting vector.  However, when I enter
> n95trt <- length(byyr$cnd95[byyr$cnd95=="tr"])
> n95trt
> the result is 68!  Any ideas why this is happening, and how I can fix the miscount?
> (That column also contains 69 entries of "c", and (relevantly?) two NA's.)
> Thanks for any help.
> Dave Parkhurst

The result you are looking for can be calculated with

  sum(byyr$cnd95 == "tr", na.rm=TRUE)

Look at

   byyr$cnd95 == "tr"

and you'll get TRUE, FALSE, and NA values: comparing an NA with "tr"
gives NA. Indexing with a logical NA yields an NA element, so your two
NAs are included in the length of the resulting vector (66 + 2 = 68).
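
A minimal sketch that reproduces the effect, assuming a made-up factor with the counts from the question (66 "tr", 69 "c", two NAs). The names cnd95, is_tr, n_with_na, n_sum and n_which are illustrative only and are not taken from the original data:

```r
# Hypothetical stand-in for byyr$cnd95: 66 "tr", 69 "c", and 2 NAs
cnd95 <- factor(c(rep("tr", 66), rep("c", 69), NA, NA))

# The comparison returns TRUE, FALSE, and NA (NA wherever cnd95 is NA)
is_tr <- cnd95 == "tr"
print(table(is_tr, useNA = "ifany"))

# Logical indexing keeps one NA element for every NA in the index
n_with_na <- length(cnd95[is_tr])

# Counting the TRUE values directly ignores the NAs
n_sum <- sum(is_tr, na.rm = TRUE)

# which() returns only the positions that are TRUE, so NAs are dropped
n_which <- length(cnd95[which(is_tr)])

print(c(n_with_na = n_with_na, n_sum = n_sum, n_which = n_which))
```

If you need the matching elements themselves rather than just their count, which() is one way to drop the NAs before indexing.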

Uwe Ligges
