[R-sig-ME] Teaching Mixed Effects
a.beckerman at sheffield.ac.uk
Tue Jan 20 12:47:22 CET 2009
Dear R-Mixed people -
I am about to embark on a day of attempting to teach some aspects of
mixed models using R to PhD students. I was wondering if anyone would
be willing to indulge in this summary below, developed through reading
threads on R-Mixed and R-Help over the past few months, and vet my
list of issues/questions/topics (4) associated with mixed models?
Let me reduce any rising blood pressure by saying that I understand
(possibly) and accept why there are no p-values in lmer, and NONE of
the comments/questions below are about why lmer does not produce
sensible df's and p-values to calculate significance (Phew).
First, a technical question:
Based on these two threads:
IS mcmcsamp() broken for "complicated" random effects? Is it in good
enough shape to teach for "simple" gaussian mixed models, to
demonstrate the principle?
Now, here is what I am possibly going to talk about.....
0) Rule number 1 is to design experiments well, and aim for
orthogonal, well replicated and balanced designs. If you get data
that conforms to all of that, old school F-ratio's CAN be used. If
not, see 1-4 below (we will assume that Rule number 1 will be broken).
1) It is agreed that the Laplacian methods for estimating terms and
"likelihoods" in mixed effects models is considered most reliable
(accurate and precise). R (lme4) and ADMB model builder use these
methods. SAS nlmixed does, but SAS proc mixed does not appear to.
STATA can. Genstat does/can (see final note below**).
2) It is agreed that the appropriate test for fixed effects in mixed
models should be between nested models. However, there is no
agreement as how to characterise the distributions that would be used
to generate p-values. This is the crux of the Bates et al argument:
Likelihood Ratio Tests, Wald tests etc all need to assume a
distribtion and some degrees of freedom. But, in many mixed models,
the distribution need not conform to any of our standard ones (t,F,
Chi-square etc), especially when the number of subjects in the random
effects is small. Moreover, the relationship between fixed and random
effects means that it is nearly impossible, and perhaps not worthwhile
to calcuate what might be appropriate "degrees of freedom".
2.1) However, Bates et al have mentioned the restricted likelihood
ratio test. There is a package in R implementing some of these tools
(RLRsim), but these appear to be limited to and or focused on tests of
2.2) What some "other" packages do: SAS can produce wald tests and
LRT's if you want, and can implement the kenward-rogers adjustement.
There is some theory behind the K-R, but it is still not dealing with
the crux of the problem (see 2). Genstat uses wald tests and warns
you that with small numbers of subjects, these are not reliable.
Genstat is also experimenting with HGLM by Nelder and Lee (see **)
2.3) "Testing" random effects is considered inappropriate (but see 2.1
3) Because of 2, there is the resounding argument that bayesian and or
simulation/bootstrapping tools be used to evalaute fixed effects.
Current methods proposed and coded, but in various states of
mcmcsamp() and HPDinterval() from lme4 + baayen *.fnc's from languageR,
BUGS and winBugs,
RH Baayen's simulation tools (e.g. page 307 method)
Andrew Gelman and Jennifer Hill's tools (e.g. sim() method from
Ben Bolker's suggestions in this list for glmm's (thread: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001459.html)
3.1) These all evalaute "simple" tests of whether beta's and intercept
are different than 0, and are linked to the contrasts. There is no
emerging method equivalent to a LRT (but see 2.1 and **Final Note
4) Andrew Gelman et al also suggest AIC based methods and model
averaging for model inference, given constant random effects. I think
their argument about AIC is that if the "likelihood" is estimated
well, relative differences in AIC will be constant, irrespective of
any adjustement made to numbers of paramters used in calculating AIC:
i.e. as long as the random effects structure stays the same, the
relative differences between nested models will not change if the
number of paramters is estimated consistently. These methods still do
not produce p-values.
**Final Note Below - I have noticed a relative lack of discussion of
Nelder et al's H-likelihood and their methods to generate a general
method for all heirarchical modelling (HGLM?!). Would anybody be able
to comment? A recent paper (http://www.springerlink.com/content/17p17r046lx4053r/fulltext.pdf
) that is somewhat beyond my skills, indicates the use of Laplace
methods to estimate likelihoods in heirarchical models and various
capacity for model inference.
Thanks again, in advance, to anyone who took this on..... apologies
for any glaring errors or assignment of ideas to people incorrectly.
Dr. Andrew Beckerman
Department of Animal and Plant Sciences, University of Sheffield,
Alfred Denny Building, Western Bank, Sheffield S10 2TN, UK
ph +44 (0)114 222 0026; fx +44 (0)114 222 0002
More information about the R-sig-mixed-models