[R] poly(x) workaround when x has missing values
Jacob Wegelin
jawegelin at ucdavis.edu
Thu Jan 25 02:39:39 CET 2007
Often in practical situations a predictor has missing values, so that poly
crashes. For instance:
> x<-1:10
> y<- x - 3 * x^2 + rnorm(10)/3
> x[3]<-NA
> lm( y ~ poly(x,2) )
Error in poly(x, 2) : missing values are not allowed in 'poly'
>
> lm( y ~ poly(x,2) , subset=!is.na(x)) # This does not help?!?
Error in poly(x, 2) : missing values are not allowed in 'poly'
The following function seems to be an okay workaround.
Poly<- function(x, degree = 1, coefs = NULL, raw = FALSE, ...) {
notNA<-!is.na(x)
answer<-poly(x[notNA], degree=degree, coefs=coefs, raw=raw, ...)
THEMATRIX<-matrix(NA, nrow=length(x), ncol=degree)
THEMATRIX[notNA,]<-answer
attributes(THEMATRIX)[c('degree', 'coefs', 'class')]<- attributes(answer)[c('degree', 'coefs', 'class')]
THEMATRIX
}
> lm( y ~ Poly(x,2) )
Call:
lm(formula = y ~ Poly(x, 2))
Coefficients:
(Intercept) Poly(x, 2)1 Poly(x, 2)2
209.1 475.0 114.0
and it works when x and y are in a dataframe too:
> DAT<-data.frame(x=x, y=y)
> lm(y~Poly(x,2), data=DAT)
Call:
lm(formula = y ~ Poly(x, 2), data = DAT)
Coefficients:
(Intercept) Poly(x, 2)1 Poly(x, 2)2
-119.54 -276.11 -68.24
Is there a better way to do this? My workaround seems a bit awkward.
Whoever wrote "poly" must have had a good reason for not making it deal
with missing values?
Thanks for any thoughts
Jacob Wegelin
More information about the R-help
mailing list