[R] can predict ignore rows with insufficient info

Peter Whiting pete at sprint.net
Wed Sep 17 01:48:24 CEST 2003


On Tue, Sep 16, 2003 at 04:31:29PM -0400, Thomas W Blackwell wrote:
> Corrected and re-named version of function:
> 
> unsupported <- function(i,y,d)  {
>    result <- rep(F, dim(d)[1])      # default return value when
>    if (is.factor(d[[i]]))           #  d[[i]] is not a factor.
>      result <- !(d[[i]] %in% unique(d[[i]][ !is.na(d[[y]]) ]))
>    result  }
> 
> tmp.1 <- lapply(seq(along=const), unsupported, "days", const)
> tmp.2 <- matrix(unlist(tmp.1[ names(const) != "days" ]), nrow=dim(const)[1])
> tmp.3 <- as.logical(as.vector(tmp.2 %*% rep(1, dim(tmp.2)[2])))
> 
> x <- predict(g, const[ is.na(const$days) & !tmp.3, ])

Here is an approach I came up with that appears to work:

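predict2 <- function(g, data, ...)
{
  # g$xlevels holds the factor levels seen when the model was fitted
  for (nm in names(g$xlevels)) {
    cat(nm, "\n")   # progress output: name of the factor being recoded
    # levels absent from the fit become NA
    data[[nm]] <- factor(data[[nm]], levels = g$xlevels[[nm]])
  }
  predict(g, data, ...)
}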

It works by re-creating each factor predictor with factor()'s
"levels=" argument set to the levels stored in g$xlevels. Any element
whose level is not in g$xlevels ends up as an NA, which predict
handles by returning NA for that row.
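
To illustrate the recoding step on its own, here is a small sketch
using a plain lm fit. The data frame, the column names (y, grp) and
the unseen level "d" are made up for the example and are not from
the original problem.

```r
# Toy training data: factor predictor "grp" with levels a, b, c
# (hypothetical names and values, for illustration only).
train <- data.frame(
  y   = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8),
  grp = factor(c("a", "a", "b", "b", "c", "c"))
)
g <- lm(y ~ grp, data = train)

# New data containing a level ("d") that the fit never saw.
newdata <- data.frame(grp = c("a", "d", "c"))

# Re-create the factor using the levels stored in the fit;
# the unseen level "d" becomes NA.
newdata$grp <- factor(newdata$grp, levels = g$xlevels[["grp"]])
na_rows <- is.na(newdata$grp)

# Rows whose predictor is NA should get an NA prediction.
p <- predict(g, newdata)
print(data.frame(grp = newdata$grp, unseen = na_rows, pred = p))
```

The rows flagged in na_rows are exactly the ones for which the
model has no information, so they can be filtered out afterwards
if only the usable predictions are wanted.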

I'm not sure why predict doesn't do something like this by
default, but I am just a newbie.

pete



