[Rd] Mismatches in predict(newdata)
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue Nov 11 09:38:39 MET 2003
One of the recent reports was of predict.lm misbehaving when
newdata=data.frame(x=rep(NA, 10)) supplied a logical column for a variable
that had been fitted as a numeric one.
The exact problem arose because model.matrix was trying to handle a 0-level
factor (which is what that logical column got converted to by
contrasts<-). However, the problem is more general, and I have added a
layer of protection to R-devel.
When model.frame is called, it adds to its terms attribute an attribute
"dataClasses", and this can be checked against the newdata argument by a
call to .checkMFClasses: see lm and predict.lm for how to do so.
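As a rough illustration of the check described above, here is a minimal sketch. It assumes a current R installation, where .checkMFClasses is exported from the stats package; the toy data and the names fit_data, newdata and res are made up for the example and are not taken from lm or predict.lm.

```r
# Toy data (hypothetical): x is numeric when the model frame is built.
fit_data <- data.frame(x = c(1, 2, 3, 4, 5),
                       y = c(2.1, 3.9, 6.2, 7.8, 10.1))
mf <- model.frame(y ~ x, data = fit_data)
Terms <- terms(mf)

# The classes recorded at fitting time: a named character vector.
cl <- attr(Terms, "dataClasses")

# A column made only of NA is logical, not numeric.
newdata <- data.frame(x = rep(NA, 10))
m <- model.frame(delete.response(Terms), newdata, na.action = na.pass)

# Compare the recorded classes with those found in the new model frame.
# A mismatch is signalled as an error, which is captured here.
res <- tryCatch({
    .checkMFClasses(cl, m)
    "classes match"
}, error = function(e) e)

if (inherits(res, "error")) message(conditionMessage(res))
```

Here the check is expected to fail, since x was recorded as numeric but arrives as logical in newdata.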
Developers who use predict(newdata) may wish to add such code to their
packages. (You can use

    if (!is.null(cl <- attr(Terms, "dataClasses")) &&
        exists(".checkMFClasses", envir=NULL))
        .checkMFClasses(cl, m)

to be backwards compatible.)
The exact nature of the `classes' is tricky because of inheritance. I have
implemented logical, ordered, factor (not ordered), numeric (not matrix),
nmatrix.n and other: nmatrix.n is a numeric matrix of n columns (as used
by poly() and bs(), for example). Let me know if you see a need for other
categories.
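To make the categories concrete, the following sketch builds a model frame containing one variable of each of the main kinds and inspects the recorded classes. The data frame d and its column names are invented for the illustration, and the output noted in the comments assumes a current version of R.

```r
# Hypothetical data with a logical, an unordered factor, an ordered
# factor and a plain numeric variable.
d <- data.frame(
    y  = c(1.2, 2.3, 2.9, 4.1, 5.2, 5.8),
    lg = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE),
    f  = factor(c("a", "b", "a", "b", "a", "b")),
    o  = factor(c("lo", "hi", "lo", "hi", "lo", "hi"),
                levels = c("lo", "hi"), ordered = TRUE),
    x  = c(1, 2, 3, 4, 5, 6)
)

# poly(x, 2) contributes a numeric matrix with 2 columns.
mf <- model.frame(y ~ lg + f + o + x + poly(x, 2), data = d)
cl <- attr(terms(mf), "dataClasses")
print(cl)
# lg is "logical", f is "factor", o is "ordered", x and y are "numeric",
# and the poly(x, 2) term is "nmatrix.2".
```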
Brian
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595