[R] use of poly()

Thu Feb 14 01:47:15 CET 2008

On Wednesday 13 February 2008, Bill.Venables at csiro.au wrote:
> You ask
>
> 	When using continuous data in both Y and X, does the
> 	difference between "raw" and "orthogonal" polynomials
> 	have any practical meaning?
>
> Yes, indeed it does, even if X is not 'continuous'.  There are (at
> least) two practical differences:
>
> 	1. With orthogonal polynomials you are using an orthogonal
> basis, so the estimates of the regression coefficients are uncorrelated
> (and, with normal errors, statistically independent).  This makes it
> much easier in model building to get an idea of the degree of polynomial
> warranted by the data.  You can usually do it from a single model fit.
>
> 	2. With an orthogonal polynomial basis your model matrix has, in
> principle, an optimal condition number and the numerical properties of
> the least squares fitting algorithm can be much better.  If you really
> want the raw coefficients and their standard errors, &c, you unravel
> this a bit, but why would you want to?
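
Both points can be checked numerically. The following is only a sketch, reusing the simulated data from the original question further down. The helper cond() and the shifted predictor x.shift are assumptions added for illustration and appear nowhere in the thread. Shifting x away from zero is a deliberate choice here, because that is where raw powers become strongly collinear.

```r
set.seed(123456)
x <- rnorm(100)
y <- jitter(1*x + 2*x^2 + 3*x^3, 250)

# Point 1: with the orthogonal basis, X'X is diagonal, so the
# coefficient estimates are uncorrelated.
X.orth <- model.matrix(~ poly(x, 3))
XtX <- crossprod(X.orth)
round(XtX, 10)

# Consequence: the lower-order estimates do not change when
# higher-order terms are added to the model.
fit3 <- lm(y ~ poly(x, 3))
fit5 <- lm(y ~ poly(x, 5))
cbind(degree3 = coef(fit3), degree5 = coef(fit5)[1:4])

# The t-tests for every degree are available from the single larger fit.
summary(fit5)$coefficients

# Point 2: 2-norm condition number of a model matrix
# (hypothetical helper: ratio of largest to smallest singular value).
cond <- function(M) {
  d <- svd(M)$d
  max(d) / min(d)
}

# Hypothetical predictor located far from zero.
x.shift <- x + 20
cond.orth <- cond(model.matrix(~ poly(x.shift, 3)))
cond.raw  <- cond(model.matrix(~ poly(x.shift, 3, raw = TRUE)))
c(orthogonal = cond.orth, raw = cond.raw)
```

The size of the gap between the two condition numbers depends on where x sits and on the degree, so the comparison is worth rerunning on your own predictor before drawing conclusions.
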
>
> If all you are interested in is the fitted curve, though, (and this is
> indeed the key thing, not the coefficients), then what kind of basis you
> use is pretty irrelevant.
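
A quick way to confirm that the choice of basis does not affect the fitted curve is to compare the fitted values and predictions directly. This is a sketch using the same simulated data and model names as the original question below; the all.equal() comparisons are an addition for illustration.

```r
set.seed(123456)
x <- rnorm(100)
y <- jitter(1*x + 2*x^2 + 3*x^3, 250)

l.poly     <- lm(y ~ poly(x, 3))
l.poly.raw <- lm(y ~ poly(x, 3, raw = TRUE))

# The coefficients differ because the basis functions differ ...
coef(l.poly)
coef(l.poly.raw)

# ... but the fitted values agree to numerical tolerance.
all.equal(fitted(l.poly), fitted(l.poly.raw))

# The same holds for predictions on new data.
s <- seq(-3, 3, by = 0.1)
p.orth <- predict(l.poly, data.frame(x = s))
p.raw  <- predict(l.poly.raw, data.frame(x = s))
all.equal(p.orth, p.raw)
```

Both bases span the same space of cubic polynomials in x, so least squares projects y onto the same subspace either way; only the coordinates used to describe that projection change.
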
>
> Regards,
> W.

This is exactly the kind of explanation I was looking for. Thanks!

Dylan

> Bill Venables
> CSIRO Laboratories
> PO Box 120, Cleveland, 4163
> AUSTRALIA
> Office Phone (email preferred): +61 7 3826 7251
> Fax (if absolutely necessary):  +61 7 3826 7304
> Mobile:                         +61 4 8819 4402
> Home Phone:                     +61 7 3286 7700
> mailto:Bill.Venables at csiro.au
> http://www.cmis.csiro.au/bill.venables/
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Dylan Beaudette
> Sent: Thursday, 14 February 2008 6:42 AM
> To: r-help at r-project.org
> Subject: [R] use of poly()
>
> Hi,
>
> I am curious about how to interpret the results of a polynomial
> regression fitted with poly(raw=TRUE) vs. poly(raw=FALSE).
>
> set.seed(123456)
> x <- rnorm(100)
> y <- jitter(1*x + 2*x^2 + 3*x^3 , 250)
> plot(y ~ x)
>
> l.poly <- lm(y ~ poly(x, 3))
> l.poly.raw <- lm(y ~ poly(x, 3, raw=TRUE))
>
> s <- seq(-3, 3, by=0.1)
>
> lines(s, predict(l.poly, data.frame(x=s)), col=1)
> lines(s, predict(l.poly.raw, data.frame(x=s)), col=2)
>
> The fitted curves are the same, but the regression coefficients are different:
>
> as.vector(coef(l.poly))
> 1.806618 88.078858 16.194423 58.051642
>
> as.vector(coef(l.poly.raw))
> -0.1025114  1.5265248  2.0617970  2.7393995
>
>
> When using continuous data in both Y and X, does the difference between
> "raw" and "orthogonal" polynomials have any practical meaning?
>
> Thanks,
>
> Dylan

-- 
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341