[R] Systematic treatment of missing values

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun May 28 10:10:04 CEST 2006


You start with very general comments, but only use one specific function, 
match (see ?"%in%", a help page entitled `value matching').

Matching and equality are treated differently.  By definition, NA matches 
NA and nothing else, and NaN matches NaN and nothing else.  In 
comparisons (== and its relatives), these values are not comparable, so 
the result of comparing them is NA.
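
A minimal sketch of the distinction, using the vector from the question 
quoted below.  The names x, m1, m2, m3, e1, e2 and e3 are only for 
illustration:

```r
x <- c(1, 2, 3, NA)

# Matching: NA matches NA (and NaN matches NaN), so the result is never NA.
m1 <- x %in% c(2, 3)          # FALSE  TRUE  TRUE FALSE
m2 <- x %in% c(2, 3, NA)      # FALSE  TRUE  TRUE  TRUE
m3 <- match(NaN, c(NA, NaN))  # 2: NaN matches NaN but not NA

# Equality: NA and NaN are not comparable, so == returns NA for them.
e1 <- x == 2                  # FALSE  TRUE FALSE    NA
e2 <- NA == NA                # NA
e3 <- NaN == NaN              # NA

print(list(m1, m2, m3, e1, e2, e3))
```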

As you will have seen from the help page, match() has the expansion 
capacity for declaring values non-comparable.  That has not been 
implemented for a decade and no one has supplied code to implement it, so 
it seems no one has much need of it.
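
Anyone who does want missing values on the left-hand side to propagate 
can get that in user code.  The following is only a sketch: the operator 
%in_na% is a hypothetical name, not part of R, and it handles only 
missing values in x, not those in the table:

```r
# Hypothetical helper (not part of R): behaves like %in%, except that
# missing elements of x give NA instead of TRUE/FALSE.
# Note that is.na() is also TRUE for NaN, so NaN is treated the same way.
`%in_na%` <- function(x, table) {
  res <- x %in% table      # ordinary matching, never NA
  res[is.na(x)] <- NA      # decline to "take a stand" on missing x
  res
}

r1 <- c(1, 2, 3, NA) %in_na% c(2, 3)      # FALSE  TRUE  TRUE    NA
r2 <- c(1, 2, 3, NA) %in_na% c(2, 3, NA)  # FALSE  TRUE  TRUE    NA
print(r1)
print(r2)
```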

I have added notes to the help pages for match and == to say explicitly 
what matches and what is comparable.  If the *Draft* R Language Definition 
were ever to be finished it would have such details: it already has a 
useful commentary.

On Sat, 27 May 2006, David Soloveichik wrote:

> I am wondering whether there is a well-accepted approach to handling
> missing values (NA's) in a programming language such as R.  For
> example, most functions seem to propagate NA to the output when the
> value of the missing entry could have mattered.  In other words, most
> functions are not willing to "take a stand" on what the missing value
> was.  However, some functions don't seem to do this.  For example,
>
> > c(1,2,3,NA) %in% c(2,3)
> [1] FALSE  TRUE  TRUE FALSE
>
> rather than: FALSE  TRUE  TRUE NA
>
>
> Also, what is the logic of the following:
> > c(1,2,3,NA) %in% c(2,3,NA)
> [1] FALSE  TRUE  TRUE  TRUE
>
> Why is the last output value TRUE?  Why does R claim that the NA on
> the left hand side of %in% is the same as the NA on the right hand
> side of %in%?

It does not: it reports that it *matches*.  Please do read the help page 
before posting, as the posting guide asked you to.

> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


