[R-sig-ME] Correlation of random effects in lmer

Sat Apr 28 05:27:52 CEST 2012

I wonder if the data have been scaled?  Try transforming all numeric
predictors to have zero mean and unit variance before fitting.

Cheers

Andrew

On Sat, Apr 28, 2012 at 02:39:33AM +0000, Ben Bolker wrote:
> arun <smartpink111 at ...> writes:
> 
> > I am testing a couple of logistic regression longitudinal models
> > using lmer.  I got stuck in a position where I found that the
> > correlation between random effects is 1.00 (intercept and slope
> > -model with one term for the same grouping factor), while the
> > std. deviation is very low for the slope.  Then, I tested another
> > model with more than one term for the same grouping factor.  The LRT
> > test p value is not significant.  Is it okay to keep the second
> > model in this case?
> 
>   I'm not sure I agree with your terminology here -- there are really
> two terms in both the (1+time|subject) model and the (1|subject) +
> (0+time|subject) model, they are just forced to be independent in the
> second case while the first case allows them to be correlated.
> 
> > Random effects:
> >  Groups  Name        Variance   Std.Dev. Corr  
> >  resid   (Intercept) 10.9575146 3.310214       
> >  Subject (Intercept)  0.8220584 0.906674       
> >          time         0.0041092 0.064103 1.000 
> > Number of obs: 392, groups: resid, 392; Subject, 25
> > 
> > > anova(fm2_BDat,fm1_BDat)
> > Data: Behavdat
> > Models:
> > fm2_BDat: Response3 ~ 1 + Wavelength * Start_Resp * time + (1 | resid) + 
> > fm2_BDat:     (1 | Subject_BDat) + (0 + time | Subject_BDat)
> > fm1_BDat: Response3 ~ 1 + Wavelength * Start_Resp * time + (1 | resid) + 
> > fm1_BDat:     (1 + time | Subject_BDat)
> >          Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
> > fm2_BDat 15 754.30 813.87 -362.15                         
> > fm1_BDat 16 754.24 817.78 -361.12 2.0564      1     0.1516
> 
>   This is a very common situation, and a bit of a tough call (I
> haven't seen much in the way of formal, rigorously tested guidance for
> this situation.)  What glmer is telling you is essentially that you
> really don't have enough data to fit two separate random effects
> ((intercept):Subject and time:Subject), so the fit is collapsing
> onto a linear combination of the two.  Your second model (fm2_BDat)
> is enforcing a zero correlation between the two random effects: I would
> not be surprised if the variance of one of the two were very close
> to zero (although you haven't shown us that).
> 
>   I will give several (conflicting) arguments here, I would be
> interested to hear what others have to say:
> 
>   1. You have no _a priori_ way of knowing that the correlation
> *is* exactly zero (if you had enough data, you would almost certainly
> find that there was a non-zero correlation between these terms);
> you shouldn't be doing 'sacrificial' or stepwise removal of terms.
> Keep the more complex model.
> 
>   2. You should be trying to select the 'minimal adequate' model; in
> particular, overfitting the model (including zero and/or perfectly
> correlated terms) is more likely to lead to numeric problems, so it's
> better to try to reduce the model until all the terms can be uniquely
> estimated.  (This is the approach suggested in Bolker et al 2008
> _Trends in Ecology and Evolution_)
> 
>  3.  AIC (which is looking for the best *predictive* model) very
> slightly favors the model with correlation, although it's almost a
> wash; if you were finding model-averaged predictions, they would be
> nearly a 50/50 mixture of the predictions from the two models.
> 
>  4. BIC (which is looking for the "true" model, i.e. identifying the
> correct dimensionality) favors the simpler (no-correlation) model.
> 
>  5. If you are most interested in inference on the fixed-effect terms,
> I doubt it matters -- my guess is that these two models give nearly
> identical fixed-effect estimates.
> 
>   As long as you decide what to do in a principled way (which
> approach seems to best fit the analysis you want to do?) rather
> than by selecting the one that gives you the answers you like
> best, I think either model is defensible.
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

-- 
Andrew Robinson  
Deputy Director, ACERA 
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia               (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr              Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/