[R] NAs - NAs are not allowed in subscripted assignments

Nordlund, Dan (DSHS/RDA) NordlDJ at dshs.wa.gov
Thu Jul 24 19:39:34 CEST 2008


> -----Original Message-----
> From: Gabor Csardi [mailto:csardi at rmki.kfki.hu] 
> Sent: Thursday, July 24, 2008 9:59 AM
> To: Nordlund, Dan (DSHS/RDA)
> Cc: r-help at r-project.org
> Subject: Re: [R] NAs - NAs are not allowed in subscripted assignments
> 
> On Thu, Jul 24, 2008 at 09:30:54AM -0700, Nordlund, Dan (DSHS/RDA) wrote:
> [...]
> > > > a <- c(rep(seq(1,4),4),NA,NA)
> > > > b <- c(rep(seq(1,2),7),NA,NA,1,2)
> > > 
> > > Andreas,
> > > 
> > > what is wrong with 
> > > 
> > > a[ (a < 2 | a > 3) & b==1 ] <- NA
> > > 
> > > ? Isn't this what you want?
> > > 
> [...]
> > 
> > As I mentioned in my response to this thread, there are some things
> > I don't quite understand with logical indexing.  Using the above
> > example,
> > 
> > > a[ (a < 2 | a > 3) & b==1 ]
> > 
> > returns
> > 
> > [1]  1  1  1  1 NA NA
> > 
> > Where do the NA values come from?
> 
> This is not really about logical indexing, just operations on numeric 
> and logical vectors, and how they handle NA values. Just keep in mind 
> that NA means that we don't know the actual value. All operations 
> were designed (I believe) with this in mind.
> Here is some help:
> 
> > NA == 1
> [1] NA
> > class(NA == 1)
> [1] "logical"
> 
> This is NA, obviously, as _we don't know_ whether NA is equal 
> to 1 or not.
> 
> > TRUE & NA
> [1] NA
> > FALSE | NA
> [1] NA
> 
> The same applies here: to compute the result we would need to know
> whether NA is TRUE or FALSE. However, we have
> 
> > FALSE & NA
> [1] FALSE
> > TRUE | NA
> [1] TRUE
> 
> In these cases the result can be calculated without knowing what 
> NA actually is.
> 
> Logical indexing is simple: for every TRUE value in the logical vector
> we choose the corresponding element from the indexed vector. If we 
> index with NA, then the chosen element is NA as well.
> 
> > (1:5)[ c(T,T,T,T,T) ]
> [1] 1 2 3 4 5
> > (1:5)[ c(T,T,T,F,T) ]
> [1] 1 2 3 5
> > (1:5)[ c(T,T,T,F,NA) ]
> [1]  1  2  3 NA
> > (1:5)[ c(NA,T,T,F,NA) ]
> [1] NA  2  3 NA
> 
> Does this help? Best,
> Gabor
> 

Yes, it does help.  I was misunderstanding how logical values are used for indexing.  I assumed incorrectly that a value would be returned only if the index expression evaluated as TRUE.  It would seem that the philosophy is that not returning a value would imply that the expression evaluated to FALSE.  So, indexing with NA must return something, and NA is the appropriate value to return since one doesn't know what it is.

Is that a reasonable summary?
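
For the archive, here is a minimal sketch of how this plays out on the a and b vectors quoted above. The names cond, na_pos, true_pos, a1, a2 and res are just illustrative choices of mine, and which() is one common way to drop the unknown positions, not necessarily what the original poster ended up using.

```r
a <- c(rep(seq(1, 4), 4), NA, NA)
b <- c(rep(seq(1, 2), 7), NA, NA, 1, 2)

# The index expression from the thread; it is NA wherever the answer
# cannot be decided from the known values
cond <- (a < 2 | a > 3) & b == 1

# Positions where the condition is unknown (NA)
na_pos <- which(is.na(cond))
print(na_pos)

# Logical indexing returns one NA for every NA in the index
print(a[cond])

# which() keeps only the positions that are definitely TRUE
true_pos <- which(cond)
print(a[true_pos])

# Assigning a single value through an index containing NA is allowed;
# the NA positions are simply not replaced
a1 <- a
a1[cond] <- NA

# Assigning more than one value through such an index is an error
# (assumed here to be the situation behind the subject line)
res <- tryCatch({
  a2 <- a
  a2[cond] <- seq_len(6)
  "no error"
}, error = function(e) conditionMessage(e))
print(res)
```

If the intent is "replace only where the condition is known to be TRUE", indexing with which(cond) on the left-hand side sidesteps the question for replacement values of any length.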

Dan 

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA  98504-5204
 
 


