[R-sig-ME] Correlation of random effects in lmer
Andrew Robinson
A.Robinson at ms.unimelb.edu.au
Sat Apr 28 05:27:52 CEST 2012
I wonder if the data have been scaled? Try transforming all numeric
predictors to have zero mean and unit variance before fitting.
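
A minimal sketch of that rescaling in base R. The data frame `dat` and its columns are made up for illustration and are not the poster's data:

```r
# Hypothetical data standing in for the real data set
set.seed(1)
dat <- data.frame(
  subject = factor(rep(1:5, each = 4)),
  time    = rep(c(0, 10, 20, 30), times = 5),
  resp    = rbinom(20, 1, 0.5)
)

# Centre and scale the numeric predictors only (not the response or factors)
num_pred <- c("time")
dat_scaled <- dat
dat_scaled[num_pred] <- lapply(dat[num_pred],
                               function(x) as.numeric(scale(x)))

# Each rescaled column now has mean 0 and standard deviation 1
sapply(dat_scaled[num_pred], mean)
sapply(dat_scaled[num_pred], sd)
```

Slopes fitted to the rescaled predictor are then per standard deviation of the predictor rather than per original unit.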
Cheers
Andrew
On Sat, Apr 28, 2012 at 02:39:33AM +0000, Ben Bolker wrote:
> arun <smartpink111 at ...> writes:
>
> > I am testing a couple of longitudinal logistic regression models
> > using lmer. I am stuck: in the intercept-and-slope model (one term
> > for the grouping factor), the correlation between the random
> > effects is 1.00, while the std. deviation of the slope is very
> > low. I then fitted another model with more than one term for the
> > same grouping factor. The LRT p-value is not significant. Is it
> > okay to keep the second model in this case?
>
> I'm not sure I agree with your terminology here -- there are really
> two terms in both the (1+time|subject) model and the (1|subject) +
> (0+time|subject) model; they are just forced to be independent in the
> second case, while the first case allows them to be correlated.
>
> > Random effects:
> >  Groups  Name        Variance   Std.Dev. Corr
> >  resid   (Intercept) 10.9575146 3.310214
> >  Subject (Intercept)  0.8220584 0.906674
> >          time         0.0041092 0.064103 1.000
> > Number of obs: 392, groups: resid, 392; Subject, 25
> >
> > > anova(fm2_BDat,fm1_BDat)
> > Data: Behavdat
> > Models:
> > fm2_BDat: Response3 ~ 1 + Wavelength * Start_Resp * time + (1 | resid) +
> > fm2_BDat: (1 | Subject_BDat) + (0 + time | Subject_BDat)
> > fm1_BDat: Response3 ~ 1 + Wavelength * Start_Resp * time + (1 | resid) +
> > fm1_BDat: (1 + time | Subject_BDat)
> >          Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
> > fm2_BDat 15 754.30 813.87 -362.15
> > fm1_BDat 16 754.24 817.78 -361.12 2.0564      1     0.1516
>
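
The likelihood-ratio test in the table can be checked by hand in base R. This is only a sketch using the rounded values printed above, so the recomputed statistic differs slightly from the 2.0564 that anova() reports:

```r
# Values copied from the anova() output above (rounded as printed)
logLik_fm2 <- -362.15   # uncorrelated intercept and slope, 15 parameters
logLik_fm1 <- -361.12   # correlated intercept and slope, 16 parameters

# LRT statistic: twice the difference in log-likelihoods
chisq_rounded <- 2 * (logLik_fm1 - logLik_fm2)
chisq_rounded   # about 2.06, from the rounded log-likelihoods

# p-value from a chi-squared distribution with 16 - 15 = 1 df,
# using the unrounded statistic that anova() printed
p_value <- pchisq(2.0564, df = 1, lower.tail = FALSE)
round(p_value, 4)
```
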
> This is a very common situation, and a bit of a tough call (I
> haven't seen much in the way of formal, rigorously tested guidance
> for it). What glmer is telling you is essentially that you
> really don't have enough data to fit two separate random effects
> ((intercept):Subject and time:Subject), so the fit is collapsing
> onto a linear combination of the two. Your second model (fm2_BDat)
> is enforcing a zero correlation between the two random effects: I would
> not be surprised if the variance of one of the two were very close
> to zero (although you haven't shown us that).
>
> I will give several (conflicting) arguments here; I would be
> interested to hear what others have to say:
>
> 1. You have no _a priori_ way of knowing that the correlation
> *is* exactly zero (if you had enough data, you would almost certainly
> find that there was a non-zero correlation between these terms);
> you shouldn't be doing 'sacrificial' or stepwise removal of terms.
> Keep the more complex model.
>
> 2. You should be trying to select the 'minimal adequate' model; in
> particular, overfitting the model (including zero and/or perfectly
> correlated terms) is more likely to lead to numeric problems, so it's
> better to try to reduce the model until all the terms can be uniquely
> estimated. (This is the approach suggested in Bolker et al. 2008,
> _Trends in Ecology and Evolution_.)
>
> 3. AIC (which is looking for the best *predictive* model) very
> slightly favors the model with correlation, although it's almost a
> wash; if you were finding model-averaged predictions, they would be
> nearly a 50/50 mixture of the predictions from the two models.
>
> 4. BIC (which is looking for the "true" model, i.e. identifying the
> correct dimensionality) favors the simpler (no-correlation) model.
>
> 5. If you are most interested in inference on the fixed-effect terms,
> I doubt it matters -- my guess is that these two models give nearly
> identical fixed-effect estimates.
>
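
Points 3 and 4 can be checked from the anova table with a little arithmetic. The sketch below uses Akaike weights, which is one common way to weight models for averaging (an assumption about what the 50/50 mixture refers to), with the AIC and BIC values as printed:

```r
# AIC and BIC copied from the anova() output above
aic <- c(fm2_BDat = 754.30, fm1_BDat = 754.24)
bic <- c(fm2_BDat = 813.87, fm1_BDat = 817.78)

# Akaike weights: exp(-delta/2), normalised to sum to 1
delta_aic <- aic - min(aic)
akaike_w  <- exp(-delta_aic / 2) / sum(exp(-delta_aic / 2))
round(akaike_w, 2)   # roughly 0.49 for fm2_BDat and 0.51 for fm1_BDat

# BIC difference: positive means BIC prefers the no-correlation model
bic_diff <- unname(bic["fm1_BDat"] - bic["fm2_BDat"])
bic_diff             # about 3.91
```

An AIC difference of 0.06 gives weights that are nearly equal, while a BIC difference of about 3.9 is a clearer preference for fm2_BDat.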
> As long as you decide what to do in a principled way (which
> approach seems to best fit the analysis you want to do?) rather
> than by selecting the one that gives you the answers you like
> best, I think either model is defensible.
>
--
Andrew Robinson
Deputy Director, ACERA
Department of Mathematics and Statistics Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/
Forest Analytics with R (Springer, 2011)
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009):
http://www.ms.unimelb.edu.au/spuRs/