[R] poly(x) workaround when x has missing values
Prof Brian Ripley
ripley at stats.ox.ac.uk
Thu Jan 25 08:50:08 CET 2007
Orthogonality of polynomials is not defined if they contain missing
values, which seems a good enough reason to me.
Put another way, in your solution whether the columns are orthogonal
depends on the unknown values of the NAs, and it looks like that is only
true if the unknown values are all zero.
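
A minimal sketch of that point, assuming the NA-padding approach from the
quoted message. The helper names pad() and off_diag() and the fill values
0 and 1 are illustrative assumptions, not part of the original code. The
padded columns are orthogonal over the observed rows; over the full length
a zero fill preserves that, a nonzero fill such as 1 does not, and with NA
the inner product is simply undefined.

```r
x  <- c(1, 2, NA, 4:10)
ok <- !is.na(x)
P  <- poly(x[ok], 2)            # orthonormal columns over the observed rows

# Hypothetical helper: pad back to full length with a chosen fill value.
pad <- function(fill) {
  M <- matrix(fill, nrow = length(x), ncol = ncol(P))
  M[ok, ] <- P
  M
}

# Hypothetical helper: inner product of the two columns.
off_diag <- function(M) sum(M[, 1] * M[, 2])

off_diag(P)        # essentially zero: orthogonal on the observed rows
off_diag(pad(0))   # essentially zero: a zero fill keeps orthogonality
off_diag(pad(1))   # 1: a fill of 1 in both columns breaks it
off_diag(pad(NA))  # NA: undefined while the values are unknown
```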
On Wed, 24 Jan 2007, Jacob Wegelin wrote:
>
> Often in practical situations a predictor has missing values, so that poly
> crashes. For instance:
>
>> x<-1:10
>> y<- x - 3 * x^2 + rnorm(10)/3
>> x[3]<-NA
>> lm( y ~ poly(x,2) )
> Error in poly(x, 2) : missing values are not allowed in 'poly'
>>
>> lm( y ~ poly(x,2) , subset=!is.na(x)) # This does not help?!?
> Error in poly(x, 2) : missing values are not allowed in 'poly'
>
> The following function seems to be an okay workaround.
>
> Poly<- function(x, degree = 1, coefs = NULL, raw = FALSE, ...) {
> notNA<-!is.na(x)
> answer<-poly(x[notNA], degree=degree, coefs=coefs, raw=raw, ...)
> THEMATRIX<-matrix(NA, nrow=length(x), ncol=degree)
> THEMATRIX[notNA,]<-answer
> attributes(THEMATRIX)[c('degree', 'coefs', 'class')]<- attributes(answer)[c('degree', 'coefs', 'class')]
> THEMATRIX
> }
>
>
>> lm( y ~ Poly(x,2) )
>
> Call:
> lm(formula = y ~ Poly(x, 2))
>
> Coefficients:
> (Intercept) Poly(x, 2)1 Poly(x, 2)2
> 209.1 475.0 114.0
>
> and it works when x and y are in a dataframe too:
>
>> DAT<-data.frame(x=x, y=y)
>> lm(y~Poly(x,2), data=DAT)
>
> Call:
> lm(formula = y ~ Poly(x, 2), data = DAT)
>
> Coefficients:
> (Intercept) Poly(x, 2)1 Poly(x, 2)2
> -119.54 -276.11 -68.24
>
> Is there a better way to do this? My workaround seems a bit awkward.
> Whoever wrote "poly" must have had a good reason for not making it deal
> with missing values?
>
> Thanks for any thoughts
>
> Jacob Wegelin
>
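
On the question of a better way: one hedged alternative to wrapping poly()
is to drop the incomplete rows before the formula is evaluated. The subset=
argument does not help because the model frame evaluates poly(x, 2) on the
full x before subsetting, so the NA is still seen. The sketch below reuses
the DAT data frame from the quoted message; set.seed(1) is an assumption
added here only to make the example reproducible.

```r
set.seed(1)                      # assumption: seed added for reproducibility
x <- 1:10
y <- x - 3 * x^2 + rnorm(10) / 3
x[3] <- NA
DAT <- data.frame(x = x, y = y)

# Remove incomplete rows first, so poly() never sees an NA.
fit <- lm(y ~ poly(x, 2), data = na.omit(DAT))

coef(fit)
nobs(fit)   # 9: the row with the missing x was dropped
```

Here the polynomial basis is orthogonal over exactly the rows used in the
fit, so nothing depends on the unknown value. Unlike the Poly() wrapper,
the fitted values and residuals are not padded back to the original length.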
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595