[R] Formulae with factors that have missing values

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Oct 6 09:51:14 CEST 2000


On Fri, 6 Oct 2000, Rachel Merriman wrote:

> Hi All,
> 
> I have a formula which has a factor with NAs in it.  I wish to keep
> these in the model matrix, but the NA information is currently lost (the
> rows are kept but the NA gets converted to 0).  Any ideas as to how
> I can keep NAs in?
> 
> e.g.
> 
> junk <-
> factor(c("hi",NA,"low","low","hi","low","hi","hi","low",NA,"hi","low","hi","hi","low","hi"))
> 
> y <- c(1,2,1,2,2,2,1,2,1,1,2,2,1,1,1,2)
> 
> na.keep <- function(X){X}
> 
> myfn <- function (formula,data=sys.parent()){
>     mf <- match.call()
>     mf[[1]] <- as.name("model.frame")
>     if(is.null(mf$na.action)) mf$na.action <- as.name("na.keep")
>     mf <- eval(mf, sys.frame(sys.parent()))
>     Y <- model.extract(mf,response)
>     Terms <- attr(mf,"terms")
>     X <- model.matrix(Terms,mf)
>     X
> }
> 
> myfn(y~junk)

On my system the NA gets converted to 1.200089e-306, not 0.
That looks like a bug; S gives NA here.

The question is, what do you want NA to be represented as in the
model matrix?  If you want NA to be another level of the factor,
try

junk <- factor(c("hi",NA,"low","low","hi","low","hi","hi","low",NA,
"hi","low","hi","hi", "low","hi"), exclude="")
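
As a rough sketch of what that gives (junk2 and X are names made up for this illustration, and the default treatment contrasts are assumed), NA then gets an indicator column of its own and no NAs are left in the matrix:

```r
junk <- c("hi", NA, "low", "low", "hi", "low", "hi", "hi", "low", NA,
          "hi", "low", "hi", "hi", "low", "hi")

# Keep NA as a level of its own rather than as a missing value.
junk2 <- factor(junk, exclude = "")
levels(junk2)          # "hi", "low" and NA

# Assuming default treatment contrasts: an intercept plus one indicator
# column for each non-baseline level, the NA level included.
X <- model.matrix(~ junk2)
dim(X)                 # 16 rows, 3 columns
any(is.na(X))          # FALSE: the matrix itself holds no NAs
```

In recent versions of R, addNA(factor(junk)) or exclude = NULL should have the same effect.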

Alternatively, you might want junk to be coded, and all the columns of the
coding set to NA.  The simplest way to get that is

contrasts(junk)[junk, , drop=FALSE]

until we fix the bug.
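
To spell that out, a small sketch using the factor from the original post (coding is a name made up here, and default treatment contrasts are assumed):

```r
junk <- factor(c("hi", NA, "low", "low", "hi", "low", "hi", "hi", "low", NA,
                 "hi", "low", "hi", "hi", "low", "hi"))

# Indexing the contrast matrix by the factor uses its integer codes,
# so a missing observation picks out a row that is NA in every column.
coding <- contrasts(junk)[junk, , drop = FALSE]
dim(coding)            # 16 rows, 1 column (the indicator for "low")
which(is.na(coding))   # rows 2 and 10, the missing observations
```

The result has one row per observation, so it can be bound to an intercept column with cbind() if a full model matrix is wanted.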

As a matter of interest, what are you going to do with a model matrix
with NAs in?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
