[R-sig-ME] Teaching Mixed Effects

Tue Jan 20 12:47:22 CET 2009

Dear R-Mixed people -

I am about to embark on a day of attempting to teach some aspects of  
mixed models using R to PhD students.  I was wondering if anyone would  
be willing to indulge in this summary below, developed through reading  
threads on R-Mixed and R-Help over the past few months, and vet my  
list of issues/questions/topics (4)  associated with mixed models?

Let me reduce any rising blood pressure by saying that I understand  
(possibly) and accept why there are no p-values in lmer, and NONE of  
the comments/questions below are about why lmer does not produce  
sensible df's and p-values to calculate significance (Phew).

#######################

First, a technical question:

Based on these two threads:
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001459.html
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001456.html

IS mcmcsamp() broken for "complicated" random effects? Is it in good  
enough shape to teach for "simple" gaussian mixed models, to  
demonstrate the principle?

#######################

Now, here is what I am possibly going to talk about.....

0) Rule number 1 is to design experiments well, and aim for  
orthogonal, well replicated and  balanced designs.  If you get data  
that conforms to all of that, old school F-ratio's CAN be used.  If  
not, see 1-4 below (we will assume that Rule number 1 will be broken).

1) It is agreed that the Laplacian methods for estimating terms and  
"likelihoods" in mixed effects models is considered most reliable  
(accurate and precise). R (lme4) and ADMB model builder use these  
methods. SAS nlmixed does, but SAS proc mixed does not appear to.   
STATA can.  Genstat does/can (see final note below**).

2) It is agreed that the appropriate test for fixed effects in mixed  
models should be between nested models.  However, there is no  
agreement as how to characterise the distributions that would be used  
to generate p-values.  This is the crux of the Bates et al argument:  
Likelihood Ratio Tests, Wald tests etc all need to assume a  
distribtion and some degrees of freedom.  But, in many mixed models,  
the distribution need not conform to any of our standard ones (t,F,  
Chi-square etc), especially when the number of subjects in the random  
effects is small.  Moreover, the relationship between fixed and random  
effects means that it is nearly impossible, and perhaps not worthwhile  
to calcuate what might be appropriate "degrees of freedom".

2.1) However, Bates et al have mentioned the restricted likelihood  
ratio test.  There is a package in R implementing some of these tools  
(RLRsim), but these appear to be limited to and or focused on tests of  
random effects.

2.2) What some "other" packages do: SAS can produce wald tests and  
LRT's if you want, and can implement the kenward-rogers adjustement.   
There is some theory behind the K-R, but it is still not dealing with  
the crux of the problem (see 2).  Genstat uses wald tests and warns  
you that with small numbers of subjects, these are not reliable.  
Genstat is also experimenting with HGLM by Nelder and Lee (see **)

2.3) "Testing" random effects is considered inappropriate (but see 2.1  
for methods?).

3) Because of 2, there is the resounding argument that bayesian and or  
simulation/bootstrapping tools be used to evalaute fixed effects.   
Current methods proposed and coded, but in various states of  
usefulness are:

mcmcsamp() and HPDinterval() from lme4 + baayen *.fnc's from languageR,
BUGS and winBugs,
RH Baayen's simulation tools (e.g. page 307 method)
Andrew Gelman and Jennifer Hill's tools (e.g. sim() method from  
package arm)
Ben Bolker's suggestions in this list for glmm's (thread: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001459.html)

3.1) These all evalaute "simple" tests of whether beta's and intercept  
are different than 0, and are linked to the contrasts.  There is no  
emerging method equivalent to a LRT (but see 2.1 and **Final Note  
Below).

4) Andrew Gelman et al also suggest AIC based methods and model  
averaging for model inference, given constant random effects.  I think  
their argument about AIC is that if the "likelihood" is estimated  
well, relative differences in AIC will be constant, irrespective of  
any adjustement made to numbers of paramters used in calculating AIC:  
i.e. as long as the random effects structure stays the same, the  
relative differences between nested models will not change if the  
number of paramters is estimated consistently. These methods still do  
not produce p-values.

**Final Note Below - I have noticed a relative lack of discussion of  
Nelder et al's  H-likelihood and their methods to generate a general  
method for all heirarchical modelling (HGLM?!).  Would anybody be able  
to comment?  A recent paper (http://www.springerlink.com/content/17p17r046lx4053r/fulltext.pdf 
) that is somewhat beyond my skills, indicates the use of Laplace  
methods to estimate likelihoods in heirarchical models and various  
capacity for model inference.

Thanks again, in advance, to anyone who took this on..... apologies  
for any glaring errors or assignment of ideas to people incorrectly.

Andrew

---------------------------------------------------------------------------------
Dr. Andrew Beckerman
Department of Animal and Plant Sciences, University of Sheffield,
Alfred Denny Building, Western Bank, Sheffield S10 2TN, UK
ph +44 (0)114 222 0026; fx +44 (0)114 222 0002
http://www.beckslab.staff.shef.ac.uk/

http://www.flickr.com/photos/apbeckerman/
http://www.warblefly.co.uk