[R] example to demonstrate benefits of poly in regression?
William Dunlap
wdunlap at tibco.com
Mon Apr 1 20:09:23 CEST 2013
Here is an example of the numerical instability that poly() can help with. Imagine a 100-minute experiment where you chose to record the times as POSIXt objects (stored as seconds since 1970). Using poly() gives the right predicted values, but using I(x^2) does not:
> d <- data.frame(Date=as.POSIXct("2013-04-01")+60*(1:100), yQuadratic=(function(x)5*x-4*x^2)( (1:100)/50 ))
> d$NumDate <- as.numeric(d$Date)
> newD <- d[c(10, 80, 95),]
> cbind(newD, PQuadratic=predict(lm(yQuadratic~NumDate+I(NumDate^2),data=d),newdata=newD))
Date yQuadratic NumDate PQuadratic
10 2013-04-01 00:10:00 0.84 1364800200 2.1312
80 2013-04-01 01:20:00 -2.24 1364804400 -2.1808
95 2013-04-01 01:35:00 -4.94 1364805300 -3.1048
Warning message:
In predict.lm(lm(yQuadratic ~ NumDate + I(NumDate^2), data = d), :
prediction from a rank-deficient fit may be misleading
> cbind(newD, PQuadratic=predict(lm(yQuadratic~poly(NumDate,2),data=d),newdata=newD))
Date yQuadratic NumDate PQuadratic
10 2013-04-01 00:10:00 0.84 1364800200 0.84
80 2013-04-01 01:20:00 -2.24 1364804400 -2.24
95 2013-04-01 01:35:00 -4.94 1364805300 -4.94
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Paul Johnson
> Sent: Monday, April 01, 2013 10:21 AM
> To: R-help
> Subject: [R] example to demonstrate benefits of poly in regression?
>
> Here's my little discussion example for a quadratic regression:
>
> http://pj.freefaculty.org/R/WorkingExamples/regression-quadratic-1.R
>
> Students press me to know the benefits of poly() over the more obvious
> regression formulas.
>
> I think I understand the theory on why poly() should be more numerically
> stable, but I'm having trouble writing down an example that proves the
> benefit of this.
>
> I am beginning to suspect that because R's lm uses a QR decomposition under
> the hood, the instability that we used to see when we inverted (X'X) is no
> longer so serious as it used to be. When the columns in X are
>
> x1 x2 x3 x3squared
>
> then the QR decomposition is doing about the same thing as we would do if
> we replaced x3 and x3squared by the poly(x3, 2) results.
>
> Or not. I'm not arguing that, I'm thinking out loud, hoping you'll correct
> me.
>
> pj
>
>
> --
> Paul E. Johnson
> Professor, Political Science Assoc. Director
> 1541 Lilac Lane, Room 504 Center for Research Methods
> University of Kansas University of Kansas
> http://pj.freefaculty.org http://quant.ku.edu
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list