[Rd] Surprising behavior of letters[c(NA, NA)]

Duncan Murdoch murdoch.duncan at gmail.com
Fri Dec 17 16:37:21 CET 2010


On 17/12/2010 10:18 AM, Gabor Grothendieck wrote:
> On Fri, Dec 17, 2010 at 9:58 AM, Duncan Murdoch
> <murdoch.duncan at gmail.com>  wrote:
> >  On 17/12/2010 9:32 AM, Gabor Grothendieck wrote:
> >>
> >>  Consider this:
> >>
> >>  >    letters[c(2, 3)]
> >>  [1] "b" "c"
> >>  >    letters[c(2, NA)]
> >>  [1] "b" NA
> >>  >    letters[c(NA, 3)]
> >>  [1] NA  "c"
> >>  >    letters[c(NA, NA)]
> >>   [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> >>  [26] NA
> >>
> >>  The result is a 2-vector in each case until we get to c(NA, NA) and
> >>  then it unexpectedly changes from returning a 2-vector to returning a
> >>  26-vector.  I think most people would have expected that the answer
> >>  would be c(NA, NA).
> >>
> >
> >  This is because  c(NA, NA) is a logical vector, so it gets recycled to the
> >  length of letters, whereas c(NA, 3) and the others are numeric vectors, so
> >  they aren't recycled, they're converted to integer indices.  So the surprise
> >  is due to not recognizing that NA is logical.  You wouldn't expect a length
> >  1 result from letters[TRUE], would you?
>
> One tends not to distinguish between logical NA's and integer NA's.
> In fact R represents both of them as NA on output so this does seem
> highly error prone.
>
> >  NA # logical
> [1] NA
> >  NA_integer_ # integer
> [1] NA
>

I agree it's error prone, but I don't know a good solution.  The ability 
to distinguish them on input is a relatively recent addition (in 
2.5.0).  Changing the display on output would confuse a lot of people.
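
For reference, a minimal sketch of the distinction discussed in this thread, assuming a stock R session (2.5.0 or later, so that NA_integer_ is available). The names idx_logical and idx_integer are illustrative only. typeof() shows the difference that the printed NA hides:

```r
# A bare NA is logical, so c(NA, NA) is a logical index and is
# recycled to the length of the vector being subset.
idx_logical <- c(NA, NA)
typeof(idx_logical)              # "logical"
length(letters[idx_logical])     # 26

# NA_integer_ gives an integer index, which is positional:
# one result element per index element, no recycling.
idx_integer <- c(NA_integer_, NA_integer_)
typeof(idx_integer)              # "integer"
length(letters[idx_integer])     # 2

# Combining NA with a number coerces the whole vector to numeric,
# which is why c(NA, 3) behaves positionally.
typeof(c(NA, 3))                 # "double"

# Explicit coercion is another way to force positional indexing.
res_coerced <- letters[as.integer(c(NA, NA))]
length(res_coerced)              # 2
```

When an index vector might consist entirely of NA, writing NA_integer_ or wrapping the index in as.integer() avoids the recycling.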

Duncan Murdoch


