[R-sig-ME] general GLMM questions

Douglas Bates bates at stat.wisc.edu
Fri May 9 16:52:53 CEST 2008

On Wed, May 7, 2008 at 10:39 AM, Ben Bolker <bolker at ufl.edu> wrote:
> Hash: SHA1
> Dear sig-mixed readers,
> ~  Some of my students and I are foolishly attempting to write a review
> of GLMMs for an ecology/evolution audience.  Realizing that this is a
> huge, gnarly, and not-completely-understood subject (even for the
> experts -- see fortune("mixed")), we're trying to provide as much
> non-technical background and guidance as can be squeezed into a
> reasonable sized journal article ...
> ~  We've run into quite a few questions we have been unable to answer
> ... perhaps because no-one knows the answers, or because we're looking
> in the wrong places.  We thought we might impose on the generosity of
> the list: if this feels ridiculous, please just ignore this message.
> Feedback ranging from "this is true, but I don't know of a published
> source" to "this just isn't true" would be useful.  We are aware of the
> deeper problems of focusing on p-values and degrees of freedom -- we
> will encourage readers to focus on estimating effect sizes and
> confidence limits -- but we would also like to answer some of these
> questions for them, if we can.
> ~  Ben Bolker
> 1. Is there a published justification somewhere for Lynn Eberly's
> ( http://www.biostat.umn.edu/~lynn/ph7430/class.html ) statement that df
> adjustments are largely irrelevant if number of blocks>25 ?
> 2. What determines the asymptotic performance of the LRT (likelihood
> ratio test) for comparison of fixed effects, which is known to be poor
> for "small" sample sizes?  Is it the number of random-effects levels
> (as stated by Agresti 2002, p. 520), or is it the number of levels of
> the fixed effect relative to the total number of data points (as
> stated by Pinheiro and Bates 2000, pp. 87-89)?  (The example given by
> PB2000 from Littell et al. 1996 is a test of a treatment factor with
> 15 levels, in a design with 60 observations and 15 blocks.  Agresti's
> statement would imply that one would still be in trouble if the total
> number of observations increased to 600 [because # blocks is still
> small], where PB2000 would imply that the LRT would be OK in this
> limit.  (A small experiment with the simulate.lme() example given on
> PB2000 p. 89 suggests that increasing the sample size 10-fold with the
> same number of blocks DOES make the LRT OK ... but I would need to do
> this a bit more carefully to be sure.)  (Or is this a difference
> between the linear and generalized linear case?)
> 3. For multi-level models (nested, certainly crossed), how would one
> count the "number of random-effects levels" to quantify the 'sample
> size' above?  With a single random effect, we can just count the
> number of levels (blocks).  What would one do with e.g. a nested or
> crossed design?  (Perhaps the answer is "don't use a likelihood ratio
> test to evaluate the significance of fixed effects".)
> 4. Does anyone know of any evidence (in either direction) that the
> "boundary" problems that apply to the likelihood ratio test (e.g. Self
> and Liang 1987) also apply to information criteria comparisons of
> models with and without random effects?  I would expect so, since the
> derivations of the AIC involve Taylor expansions around the
> null-hypothesis parameters ...
> 5. It's common sense that estimating the variance of a random effect
> from a small number of levels (e.g. less than 5) should be dicey, and
> that one might in this case want to treat the parameter as a fixed
> effect (regardless of its philosophical/experimental design status).
> For small numbers of levels I would expect (?) that the answers MIGHT
> be similar -- among other things the difference between df=1 and
> df=(n-1) would be small.  But ... is there a good discussion of this
> in print somewhere?  (Crawley mentions this on p. 670 of "Statistical
> Computing", but without justification.)
> lme4-specific questions:
> 6. Behavior of glmer: Does glmer really use AGQ, or just Laplace?
> Both?  pp. 28-32 of the "Implementation" vignette in lme4 suggest that
> a Laplace approximation is used, but I can't figure out whether this
> is an additional approximation on top of the AGQ/Laplace approximation
> of the integral over the random effects used in "ordinary" LMM.  When
> I fit a GLMM with the different methods, the fitted objects are not
> identical but all the coefficients seem to be.  (I have poked at the
> code a bit but been unable to answer this question for myself
> ... sorry ...)

To answer this question I must again, I regret, distinguish between
the CRAN version of the lme4 package and the R-forge development
version of lme4.

In the R-forge version the only method for generalized linear mixed
models and for nonlinear mixed models is direct optimization of the
Laplace approximation to the deviance.  One of the Summer of Code
projects that Google has funded for the R Foundation (see
http://code.google.com/soc/2008/rf/about.html) is implementation of
the Adaptive Gauss-Hermite Quadrature approximation to the deviance.
That will be implemented in the development version (i.e. the R-forge
version) of the lme4 package.  AGQ will only be offered for models
with a single grouping factor for the random effects.

I realize that it is somewhat irritating and confusing to readers of
this list to have descriptions of the R-forge version of the package
contrasted with the CRAN version.  It is natural to expect that the
R-forge version should be the version on CRAN.  The reason that I have
not yet released the R-forge to CRAN is because of problems with the
mcmcsamp function in the R-forge version.  If I moved the R-forge
version to CRAN then code from Harald Baayen's book and probably code
from Gelman and Hill's book would no longer work.  Software versions
can be changed much more readily than can editions of a book.  I think
there is a way around the problem with mcmcsamp but I won't be able to
say for sure until it is coded and tested, which will take time.  I
don't want to predict exactly how much time - I always manage to
underestimate drastically.

I do not plan to provide an implementation of penalized
quasi-likelihood (PQL) in what is currently the development version
and what will become lme4_1.0.  PQL for GLMMs and the
"Lindstrom-Bates" algorithm for nonlinear mixed models (the approaches
are related) are examples of alternating conditional optimization
(think of the Gibbs sampler approach with optimization instead of
sampling).  It is a dangerous practice in that it can result in
oscillation between conditional optima.  I now prefer direct
optimization techniques where the fixed-effects parameters and the
variance components are optimized simultaneously.  There may be
circumstances where PQL is advantageous but usually those are because
of over-parameterized models.

> (The glmmML package claims to fit via Laplace or Gauss-Hermite
> quadrature (with non-adaptive, but adjustable, number of quad points
> - -- so it's at least theoretically possible?)

Yes.  I think the adaptive part is important (in fact, I know it is
important) and probably more important than the distinction between
the Laplace approximation and Gauss-Hermite quadrature.  That is, you
gain more from using the Laplace approximation at the conditional
modes of the random effects (which is the "adaptive" part) than
increasing the number of Gauss-Hermite quadrature points at the
marginal modes.  The tricky bit of AGQ or Laplace is determining the
conditional modes and that part is already done.
> library(lme4)
> set.seed(1001)
> f = factor(rep(1:10,each=10))
> zb = rnorm(1:10,sd=2) ## block effects
> x = runif(100)
> eta = 2*x+zb[f]+rnorm(100)
> y = rpois(100,exp(eta))
> g1 = glmer(y~x+(1|f),family="poisson",method="Laplace")
> g2 = glmer(y~x+(1|f),family="poisson",method="AGQ")
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
> iD8DBQFIIc1Lc5UpGjwzenMRAr4uAJ90myt79pJZCa1a801FkxHRnAHYdgCfUYy+
> P0ljXHs4lt8aTwpWKncRkBg=
> =nd22
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

More information about the R-sig-mixed-models mailing list