[Rd] Issue with prediction from lm object with poly
Ulrike Grömping
groemping at bht-berlin.de
Tue Aug 3 21:24:47 CEST 2010
DDear developeRs,
about a year ago, Alex Stolpovsky posted an issue with predict.lm on a
fit generated using poly with the raw=TRUE option and too few new data
(slightly modified reproducible example below). Alex did not get any
reply. I have just stumbled on the same problem, and I think that this
is a bug of function poly, which arises from the check whether the
polynomial degree is compatible with the number of unique data points.
Unfortunately, poly is the only comfortable way of automatically
guaranteeing compatibility between values for different degree
polynomials in newdata. With the I(x^2) solution for the square,
compatibility must be manually ascertained. I got stuck with this for
Russ Lenth's implementation of contour.lm in package rsm, where
incompatible linear and squared values are fixed, when using standard
quadratic terms (like x + I(x^2)) in the formula, because I(x^2) does
not remember it's relation to x. I tried to fix this by using
poly(x,2,raw=TRUE) instead of x + I(x^2), which failed because of the
issue raised by Alex Stolpovsky. I think it would be desirable to make
prediction with poly and the raw=TRUE option work. In my opinion, this
is more important than checking for admissibility of the degree of the
polynomial.
A related question: wouldn't it be more logical, if model.matrix would
return the complete model matrix with all polynomial degrees, but
model.frame would only return the original data used to generate the
poly matrix? This might be a change with massive consequences and
therefore undesirable, but ...
Best, Ulrike
******* the example that shows the issue
x <- c(1:10)
y <- sin(c(1:10))
fit <- lm(formula=y~poly(x, 5, raw=TRUE))
predict(fit, newdata=data.frame(x=c(1:10))) ## this works
predict(fit, newdata=data.frame(x=1)) ## this is broken, error below
Error in poly(x, 5, raw = TRUE) :
'degree' must be less than number of unique points
The problem is in poly():
if (raw) {
if (degree >= length(unique(x)))
stop("'degree' must be less than number of unique points")
More information about the R-devel
mailing list