[R-sig-ME] High correlation among random effects for longitudinal model

Joshua Rosenberg jrosen at msu.edu
Fri Apr 6 22:53:49 CEST 2018


Dear John,

Thank you so much -- I will use poly() in the future (even in cases in
which I might *not* use orthogonal polynomials), as it appears that the
summary() function then returns helpful output (i.e., correlations of the
fixed effects), in addition to poly()'s other benefits.
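
(For reference, the call was roughly as follows -- a sketch, with the
corAR1 structure carried over from my earlier model and variable names as
in my data:)

library(nlme)
m_poly <- lme(stwm ~ poly(wave, 2),
              random = ~ poly(wave, 2) | student_ID,
              correlation = corAR1(form = ~ wave | student_ID),
              data = d_grouped)
summary(m_poly)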

When I use the poly function, the random effects correlations are lower:

Random effects:
 Formula: ~+poly(wave, 2) | student_ID
 Structure: General positive-definite, Log-Cholesky parametrization
               StdDev     Corr
(Intercept)      2.178734 (Intr) p(,2)1
poly(wave, 2)1 279.521839 0.834
poly(wave, 2)2 281.979199 0.751  0.987
Residual        16.292629

As are the fixed effects:

Fixed effects: stwm ~ +poly(wave, 2)
                   Value Std.Error   DF   t-value p-value
(Intercept)      5.30155  0.231096 7070 22.940957       0
poly(wave, 2)1 196.33516 23.481935 7070  8.361115       0
poly(wave, 2)2 113.69005 23.591988 7070  4.819011       0
 Correlation:
               (Intr) p(,2)1
poly(wave, 2)1 0.344
poly(wave, 2)2 0.311  0.516

However, when I calculate the individual- (i.e., group-) specific
predicted values (the BLUPs, via the predict() method with level = 1),
they are very highly correlated:

# A tibble: 3 x 4
  rowname   intercept linear quadratic
  <chr>         <dbl>  <dbl>     <dbl>
1 intercept    NA      0.960     0.960
2 linear        0.960 NA         1.00
3 quadratic     0.960  1.00     NA
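
(A sketch of how such a table can be computed -- assuming the per-student
coefficients from coef() are what is being correlated; the NA diagonal
suggests corrr::correlate(), though base cor() gives the same off-diagonal
values:)

re <- coef(m_poly)  # per-student coefficients (fixed effects + BLUPs)
names(re) <- c("intercept", "linear", "quadratic")
cor(re)             # pairwise correlations, as in the tibble above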

When I calculate the same individual-specific predicted values using the
non-orthogonal (raw) polynomials, these correlations are very nearly as
high. At this point, I'm curious how / why these predictions are so highly
correlated.

Thank you again for your help, John, and for yours, Ben, Stuart, Jorg, and
Thierry. Sorry if this is a beginner or otherwise ignorant question as I
learn to work with these longitudinal / polynomial models.
josh

On Tue, Apr 3, 2018 at 1:32 PM, Fox, John <jfox at mcmaster.ca> wrote:

> Dear Joshua,
>
> I'm chiming in late, so it's possible that someone already pointed this
> out and I didn't notice. A better way to specify a polynomial in R is to
> use poly() in the model formula. By default, this produces orthogonal
> polynomial regressors (at least in the fixed effects) while giving the
> same fit to the data. For example,
>
> > time <- 1:5
> > X <- poly(time, 2)
>
> > X
>               1          2
> [1,] -0.6324555  0.5345225
> [2,] -0.3162278 -0.2672612
> [3,]  0.0000000 -0.5345225
> [4,]  0.3162278 -0.2672612
> [5,]  0.6324555  0.5345225
> attr(,"coefs")
> attr(,"coefs")$alpha
> [1] 3 3
>
> attr(,"coefs")$norm2
> [1]  1  5 10 14
>
> attr(,"degree")
> [1] 1 2
> attr(,"class")
> [1] "poly"   "matrix"
>
> > colSums(X)
>            1            2
> 0.000000e+00 1.110223e-16
>
> > crossprod(X)
>               1             2
> 1  1.000000e+00 -1.110223e-16
> 2 -1.110223e-16  1.000000e+00
>
> My guess is that this will also reduce the correlations among the random
> effects. If you really must have raw polynomials, then poly(time, 2,
> raw=TRUE) offers the advantage that model-structure-aware functions can
> understand that the linear and quadratic regressors are part of the same
> term in the model.
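>
> For comparison, raw polynomials simply return time and its square. A
> quick check (a sketch; exact print output may vary slightly, and the
> attributes are omitted here):
>
> > poly(time, 2, raw=TRUE)
>      1  2
> [1,] 1  1
> [2,] 2  4
> [3,] 3  9
> [4,] 4 16
> [5,] 5 25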
>
> As to whether high correlations among the random effects are really a
> problem, my guess is that they aren't, because lme() uses a log-Cholesky
> factorization of the random-effects covariance matrix anyway. In some
> contexts, high correlations might produce numerical instability, but, as
> I said, probably not here. Ben would know.
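>
> If you want to check how close the estimated random-effects covariance
> matrix is to singular, one option (a sketch; "fit" stands for your
> fitted lme object) is:
>
> > V <- getVarCov(fit)   # estimated covariance matrix of the random effects
> > eigen(V)$values       # eigenvalues near zero indicate near-singularity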
>
> I hope this helps,
>  John
>
> -----------------------------
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> Web: socialsciences.mcmaster.ca/jfox/
>
>
> > -----Original Message-----
> > From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org]
> > On Behalf Of Joshua Rosenberg
> > Sent: Monday, April 2, 2018 5:33 PM
> > To: Ben Bolker <bbolker at gmail.com>
> > Cc: R SIG Mixed Models <r-sig-mixed-models at r-project.org>
> > Subject: Re: [R-sig-ME] High correlation among random effects for
> > longitudinal model
> >
> > Dear Stuart and Ben,
> >
> > Thank you, this significantly reduced the correlations between the
> > intercept and the linear and quadratic terms (though the correlation
> > between the linear and quadratic terms is still quite high):
> >
> > Random effects:
> >  Formula: ~time + I(time^2) | student_ID
> >  Structure: General positive-definite, Log-Cholesky parametrization
> >             StdDev    Corr
> > (Intercept) 18.671959 (Intr) time
> > time        11.029842 -0.262
> > I(time^2)    8.359834 -0.506  0.959
> > Residual    29.006598
> >
> > Could I ask whether the correlation between the linear (time) and
> > quadratic (I(time^2)) terms is cause for concern -- and if so, how to
> > think about (potentially) addressing it?
> > Josh
> >
> > On Sun, Apr 1, 2018 at 12:34 PM Ben Bolker <bbolker at gmail.com> wrote:
> >
> > > On Sun, Apr 1, 2018 at 12:20 PM, Stuart Luppescu <lupp at uchicago.edu>
> > > wrote:
> > > > On Sun, 2018-04-01 at 12:55 +0000, Joshua Rosenberg wrote:
> > > >> lme(outcome ~ time + I(time^2),
> > > >>     random = ~ time + I(time^2),
> > > >>     correlation = corAR1(form = ~ time | individual_ID),
> > > >>     data = d_grouped)
> > > >>
> > > >> I have a question / concern about the random effects, as they are
> > > >> highly correlated (intercept and linear term = -.95; intercept and
> > > >> quadratic term = .96; linear term and quadratic term = -.995):
> > > >
> > > > I think it is ordinary for the intercept and the time trend to be
> > > > negatively correlated. The way to avoid this is to center the time
> > > > variable at a point in the middle of the series, so instead of
> > > > setting the values of time to {0, 1, 2, 3, 4}, use {-2, -1, 0, 1, 2}.
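> > > >
> > > > In code, the centering might look like this (a sketch; adjust the
> > > > variable names to your data):
> > > >
> > > > d_grouped$time_c <- d_grouped$time - 2  # {0,1,2,3,4} -> {-2,-1,0,1,2}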
> > > >
> > >
> > >   Agreed.  This is closely related, but not identical to, the
> > > phenomenon where the *fixed effects* are highly correlated.
> > >
> > > > --
> > > > Stuart Luppescu
> > > > Chief Psychometrician (ret.)
> > > > UChicago Consortium on School Research
> > > > http://consortium.uchicago.edu
> > >
> > --
> > Joshua Rosenberg, Ph.D. Candidate
> > Educational Psychology & Educational Technology
> > Michigan State University
> > http://jmichaelrosenberg.com
> >
>



-- 
Joshua Rosenberg, Ph.D. Candidate
Educational Psychology & Educational Technology
Michigan State University
http://jmichaelrosenberg.com
