[R] Indexing with NA as FALSE??

Barry Rowlingson B.Rowlingson at lancaster.ac.uk
Fri Jul 11 19:03:04 CEST 2003


ted.harding at nessie.mcc.ac.uk wrote:

> I know I can do it with
> t[(u==5)&(!is.na(u))]
> but in the situation I am dealing with this leads to massively
> cumbersome, typo-prone and hard-to-read code.

  You could redefine '[' or '==', but that would lead to massively 
dangerous code. Anything could happen. Anyone who writes code that 
redefines such basic stuff may need their head examined.

  I think you are going to have to work round it with the !is.na(u) 
thing, but you could wrap it up in a function:

true4sure<-function(v){v & !is.na(v)}

then

 > t[true4sure(u==5)]
[1] 5

  although perhaps you could give it a less whimsical name....
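
  As a self-contained illustration of the wrapper, here is a sketch with 
made-up data. The vectors t and u below are assumptions chosen to resemble 
the ones in the question, not the original poster's data. It also shows 
which(), which drops NA as well as FALSE and so picks out the same element 
here:

```r
# Hypothetical data: u has NA at positions 2, 4, 6 and 8
t <- 1:10
u <- c(1, NA, 3, NA, 5, NA, 7, NA, 9, 10)

# Wrapper from the message: TRUE only where v is definitely TRUE
true4sure <- function(v) v & !is.na(v)

plain   <- t[u == 5]             # NA in the index yields NA in the result
wrapped <- t[true4sure(u == 5)]  # NA in the index is treated as FALSE
viaidx  <- t[which(u == 5)]      # which() keeps only the TRUE positions

print(plain)
print(wrapped)
print(viaidx)
```

  The plain version returns an NA for every NA in u alongside the 5, 
while the other two return just the 5.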

> Also, as an extra, it would be very useful if, for instance,
> t[u==NA] --> 2 4 6 8
> (I realise that working round this is less cumbersome, but even so).

  Here is a way of doing that. It redefines '=='. It will break anything 
that depends on NA remaining NA in comparisons. Do not use this code. 
Do not even let it pollute your files. Consider it a dangerous virus:

 > assign("==",function(a,b){a[is.na(a)]<-FALSE; b[is.na(b)]<-FALSE; 
  get("==","package:base")(a,b)})

  and then you get:

 > c(1,2,3,NA,NA,NA) == c(1,NA,2,NA,NA,4)
[1]  TRUE FALSE FALSE  TRUE  TRUE FALSE

> Instead of that, since NA is one of the three values TRUE, FALSE, NA
> of a logical, I'd like to be able to (a) treat NA as FALSE, (b) test
> for a match between NA (as specified by me) and NA (as the value of
> a logical variable).

  That's what it does. Of course it has a bug/feature in that NA is now 
== to FALSE.... But then you aren't going to use that code.
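
  For anyone who has already pasted such an override into a session, here 
is a sketch of the side effect and of how to get rid of it. It assumes the 
override was created in the current (normally the global) environment, so 
that rm() removes the mask and the base operator becomes visible again. 
The variable names are made up for the demonstration:

```r
# Mask '==' in the current environment, as in the message (do not keep this)
assign("==", function(a, b) {
  a[is.na(a)] <- FALSE
  b[is.na(b)] <- FALSE
  get("==", "package:base")(a, b)
})

masked_false <- NA == FALSE  # NA now compares equal to FALSE
masked_zero  <- NA == 0      # and to 0, since FALSE is coerced to 0

# Remove the mask; lookups of '==' find the base version again
rm("==")

restored <- NA == FALSE      # back to NA

print(c(masked_false, masked_zero, restored))
```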

  Safer would be to define a new binary operator:

 > assign("%=na%",function(a,b){a[is.na(a)]<-FALSE; b[is.na(b)]<-FALSE; 
  get("==","package:base")(a,b)})

  Then you can do:

 > c(1,2,3,NA,NA,NA) %=na% c(1,NA,2,NA,NA,4)
[1]  TRUE FALSE FALSE  TRUE  TRUE FALSE

  Again, this has the same NA==FALSE property.

  Here's a truth table for that operator:

 > outer(c(T,F,NA),c(T,F,NA),"%=na%")
       [,1]  [,2]  [,3]
[1,]  TRUE FALSE FALSE
[2,] FALSE  TRUE  TRUE
[3,] FALSE  TRUE  TRUE

  You just need to write an operator that returns TRUE on the diagonal 
only.... Easy modification of %=na% but it's late on a Friday and I have 
a poker game to attend...
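
  One possible version of that diagonal-only operator is sketched below. 
The name %eqna% is made up for this example, and this is only one way to 
write it: it returns TRUE when both sides are NA, or when both are non-NA 
and equal, and FALSE otherwise, so it never returns NA:

```r
# Hypothetical operator: NA matches NA, NA never matches a non-NA value
"%eqna%" <- function(a, b) {
  both_na  <- is.na(a) & is.na(b)
  # FALSE & NA is FALSE, so this term is never NA
  both_eq  <- !is.na(a) & !is.na(b) & (a == b)
  both_na | both_eq
}

# Truth table over TRUE, FALSE, NA: TRUE on the diagonal only
tab <- outer(c(TRUE, FALSE, NA), c(TRUE, FALSE, NA), "%eqna%")
print(tab)

# Same numeric example as used for %=na%
res <- c(1, 2, 3, NA, NA, NA) %eqna% c(1, NA, 2, NA, NA, 4)
print(res)
```

  Unlike %=na%, this gives FALSE for NA against FALSE (or against 0), 
because the NAs are tested directly instead of being replaced by FALSE 
before the comparison.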

  Did I say not to use my code that redefines '=='? Well, don't use it. Ever.

Baz



