[R-sig-ME] Raw and orthogonal polynomials differences in results
Steven McKinney
smckinney at bccrc.ca
Fri Jul 16 21:04:09 CEST 2010
Hi Rebecca
The high correlation between the linear and
quadratic terms in the raw polynomial
form (-0.993) is a warning sign that
raw polynomial fits to this data will be
numerically challenging (due to numerical
precision issues involved in inverting
matrices with nearly linearly-dependent
columns etc.).
In such situations orthogonal polynomial
fits are essential. (Notice the lack of correlation
of the orthogonal polynomial regressors - all
close to zero.) Really, orthogonal polynomial
fits are always the best approach. When
"raw" polynomial predictor variables show
low correlation, the model may be easier to
interpret or explain in some circumstances, but the
polynomial fit should always be checked to confirm that
there are no numerical stability issues involved
with the model fit estimates.
The polynomial fit is indeed the one to interpret,
and it shows no evidence of a quadratic component.
You report that no quadratic component appears
in plots of the data, so that concurs with the
(numerically more stable) orthogonal polynomial
fit results.
Mean-centred variables often perform better
in "raw" polynomial fits, so if you form a new
variable
F2dat$Av_alt_m <- F2dat$Av_alt - mean(F2dat$Av_alt, na.rm = TRUE)
then a fit of
total_RO ~ Av_alt_m + I(Av_alt_m^2) + (1 | Block) + (1 | fam:Type)
may show lower correlation between the linear and quadratic
predictors.
The orthogonal polynomial fit is the safest fit with
which to statistically assess the functional form of
the association of the response with the predictor.
Other basis spline fits and smoothers are also useful.
Steven McKinney, Ph.D.
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre
email: smckinney +at+ bccrc +dot+ ca
tel: 604-675-8000 x7561
BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C.
V5Z 1L3
Canada
> -----Original Message-----
> From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-
> models-bounces at r-project.org] On Behalf Of Rebecca Ross
> Sent: July-16-10 7:03 AM
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] Raw and orthogonal polynomials differences in
> results
>
> Dear List,
>
> I apologise if this has been covered before, but I haven't been able to
> find an exact answer to my question and I am hoping someone can help or
> advise. I am fitting an lmer model and would like to know if there is
> evidence for curvature or not in the regression, and if there is
> curvature, to plot the predictions and see what it looks like. I had
> been using orthogonal polynomials (poly default), but I have realised
> that I get vastly different results, in terms of significance,
> depending on whether I use orthogonals or raw. I have the impression
> from reading about this that orthogonal ones are better for my case (ie
> model selection, but not to extract the equation), but I am not sure if
> that is correct, and if it means I can follow the orthogonal or the raw
> result. Model formula and model summaries are below for raw and
> orthogonal polynomals. The data 'looks' like there should be a linear
> but not quadratic effect.
> Thank you in advance,
>
> Rebecca Ross,
> Dphil Candidate,
> Oxford University
>
> > summary(avalt2POLY)
> Linear mixed model fit by REML
> Formula: total_RO ~ poly(Av_alt, 2) + (1 | Block) + (1 | fam:Type)
> Data: F2dat
> AIC BIC logLik deviance REMLdev
> 2557 2580 -1273 2562 2545
> Random effects:
> Groups Name Variance Std.Dev.
> fam:Type (Intercept) 28.621 5.3499
> Block (Intercept) 14.759 3.8417
> Residual 111.788 10.5730
> Number of obs: 329, groups: fam:Type, 194; Block, 5 Fixed effects:
> Estimate Std. Error t value
> (Intercept) 23.281 1.856 12.541
> poly(Av_alt, 2)1 -82.333 12.776 -6.444
> poly(Av_alt, 2)2 -3.875 12.781 -0.303
> Correlation of Fixed Effects:
> (Intr) p(A_,2)1
> ply(Av_,2)1 0.000
> ply(Av_,2)2 0.000 -0.002
> > summary(avalt2RAW)
> Linear mixed model fit by REML
> Formula: total_RO ~ Av_alt + I(Av_alt^2) + (1 | Block) + (1 | fam:Type)
> Data: F2dat
> AIC BIC logLik deviance REMLdev
> 2605 2627 -1296 2562 2593
> Random effects:
> Groups Name Variance Std.Dev.
> fam:Type (Intercept) 28.621 5.3499
> Block (Intercept) 14.759 3.8417
> Residual 111.788 10.5730
> Number of obs: 329, groups: fam:Type, 194; Block, 5 Fixed effects:
> Estimate Std. Error t value
> (Intercept) 3.709e+01 1.024e+01 3.623
> Av_alt -6.827e-03 1.444e-02 -0.473
> I(Av_alt^2) -1.479e-06 4.877e-06 -0.303 Correlation of Fixed Effects:
> (Intr) Av_alt
> Av_alt -0.974
> I(Av_alt^2) 0.950 -0.993
>
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
More information about the R-sig-mixed-models
mailing list