[R] binary operators that never return missing values
Duncan Murdoch
murdoch.duncan at gmail.com
Thu Jun 21 01:04:05 CEST 2012
On 12-06-20 4:44 PM, Anthony Damico wrote:
> Hi, I work with data sets with lots of missing values. We often need
> to conduct logical tests on numeric vectors containing missing values.
> I've searched around for material and conversations on this topic,
> but I'm having a hard time finding anything. Has anyone written a
> package that deals with this sort of thing? All I want are a group of
> functions like the ones I've posted below, but I'm worried I'm
> re-inventing the wheel. If they're not already on CRAN, I feel like
> I should add them. Any pointers to work already completed on this
> subject would be appreciated. Thanks!
>
> Anthony Damico
> Kaiser Family Foundation
>
>
>
> Here's a simple example of what I need done on a regular basis:
>
> #two numeric vectors
> a <- c( 1 , NA , 7 , 2 , NA )
>
> b <- c( NA , NA , 9 , 1 , 6 )
>
> #this has lots of NAs
> a > b
>
> #save this result in x
> x <- (a > b)
>
> #overwrite NAs in x with falses (which we do a lot)
> x <- ifelse( is.na( x ) , F , x )
>
> #now x has only trues and falses
> x

Not necessarily. F is an ordinary variable, not a reserved word; if it
happens to hold the value TRUE or 17, then x will get that. Writing
FALSE instead avoids the problem.
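
As a minimal sketch of that pitfall (the reassignment of F to 17 is contrived for illustration; the vectors are the ones from the quoted message):

```r
a <- c(1, NA, 7, 2, NA)
b <- c(NA, NA, 9, 1, 6)
x <- a > b                           # NA NA FALSE TRUE NA

F <- 17                              # legal: F is just a variable (contrived value)
bad <- ifelse(is.na(x), F, x)        # NAs are replaced by 17; the result is numeric
rm(F)                                # drop the global copy; base R's F is visible again

good <- ifelse(is.na(x), FALSE, x)   # FALSE is a reserved word and cannot be reassigned
also_good <- x & !is.na(x)           # same result without ifelse()
```

Spelling out FALSE, or skipping ifelse() altogether with x & !is.na(x), removes the dependence on whatever F currently holds.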

For your question: I think what you're doing is a bad idea. There are
certain relations that hold for ">" that just don't hold for your
function, e.g.

    (a > b) is the same as !(a <= b)
    (a > b) is the same as ( !(a < b) & (a != b) )
    if !(a < b) and !(b < c), then !(a < c)

etc.
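
A rough sketch of how these relations break, assuming NA-to-FALSE versions of the comparison operators (the names na_false, %gt%, %le% and %lt% are hypothetical, made up for this illustration):

```r
# Hypothetical helper: turn NA results into FALSE
na_false <- function(r) r & !is.na(r)

# Hypothetical "never missing" comparison operators
`%gt%` <- function(a, b) na_false(a > b)
`%le%` <- function(a, b) na_false(a <= b)
`%lt%` <- function(a, b) na_false(a < b)

a <- c(1, NA, 7, 2, NA)
b <- c(NA, NA, 9, 1, 6)

gt     <- a %gt% b        # FALSE FALSE FALSE TRUE FALSE
not_le <- !(a %le% b)     # TRUE  TRUE  FALSE TRUE TRUE
mismatch <- which(gt != not_le)   # positions where the first relation fails

# Third relation: !(p < q) and !(q < r) should imply !(p < r)
p <- 1; q <- NA; r <- 2
no_pq <- !(p %lt% q)      # TRUE
no_qr <- !(q %lt% r)      # TRUE
no_pr <- !(p %lt% r)      # FALSE, because 1 < 2
```

Wherever either operand is NA, "greater than" and "not less than or equal" disagree, and the NA in the middle breaks the chain from p to r.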

I think you'll find it very difficult to define the other comparison
operators in a way that doesn't lead to strange behaviour when these
relations are violated. Even if you never use any other comparisons,
your reasoning about results will end up incorrect, because these
relations are so ingrained into our psyches.

It would probably be easier to get consistency if you treated NA as -Inf
or +Inf, or just avoided the suggestive name: define foo(a,b) to return
TRUE or FALSE according to your desired rules, and don't pretend it's an
order relation.
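
Both options might look something like this; it is only a sketch, and the names na_low and known_and_greater are hypothetical stand-ins (the second plays the role of foo):

```r
a <- c(1, NA, 7, 2, NA)
b <- c(NA, NA, 9, 1, 6)

# Option 1: treat NA as -Inf, then use the ordinary operators
na_low <- function(v) ifelse(is.na(v), -Inf, v)

gt <- na_low(a) > na_low(b)      # TRUE FALSE FALSE TRUE FALSE
le <- na_low(a) <= na_low(b)     # FALSE TRUE TRUE FALSE TRUE
consistent <- all(gt == !le)     # the usual relation holds again

# Option 2: a plainly named function that does not claim to be ">"
known_and_greater <- function(a, b) {
  r <- a > b
  r & !is.na(r)                  # TRUE only when both values are known and a > b
}
kg <- known_and_greater(a, b)    # FALSE FALSE FALSE TRUE FALSE
```

Note that the two options disagree on the first element (1 versus NA): with NA as -Inf, 1 counts as greater, whereas the named function returns FALSE there. Which rule is wanted depends on what a missing value means in the data.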
Duncan Murdoch