[R-sig-ME] general GLMM questions

Thu May 8 10:45:50 CEST 2008

----- Original Message ----- 
From: "Ben Bolker" <bolker at ufl.edu>
To: "R Mixed Models" <r-sig-mixed-models at r-project.org>
Sent: Wednesday, May 07, 2008 5:39 PM
Subject: [R-sig-ME] general GLMM questions

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Dear sig-mixed readers,
>
> ~  Some of my students and I are foolishly attempting to write a 
> review
> of GLMMs for an ecology/evolution audience.  Realizing that this is 
> a
> huge, gnarly, and not-completely-understood subject (even for the
> experts -- see fortune("mixed")), we're trying to provide as much
> non-technical background and guidance as can be squeezed into a
> reasonable sized journal article ...
>
> ~  We've run into quite a few questions we have been unable to 
> answer
> ... perhaps because no-one knows the answers, or because we're 
> looking
> in the wrong places.  We thought we might impose on the generosity 
> of
> the list: if this feels ridiculous, please just ignore this message.
> Feedback ranging from "this is true, but I don't know of a published
> source" to "this just isn't true" would be useful.  We are aware of 
> the
> deeper problems of focusing on p-values and degrees of freedom -- we
> will encourage readers to focus on estimating effect sizes and
> confidence limits -- but we would also like to answer some of these
> questions for them, if we can.
>
> ~  Ben Bolker
>
> 1. Is there a published justification somewhere for Lynn Eberly's
> ( http://www.biostat.umn.edu/~lynn/ph7430/class.html ) statement 
> that df
> adjustments are largely irrelevant if number of blocks>25 ?
>
> 2. What determines the asymptotic performance of the LRT (likelihood
> ratio test) for comparison of fixed effects, which is known to be 
> poor
> for "small" sample sizes?  Is it the number of random-effects levels
> (as stated by Agresti 2002, p. 520), or is it the number of levels 
> of
> the fixed effect relative to the total number of data points (as
> stated by Pinheiro and Bates 2000, pp. 87-89)?  (The example given 
> by
> PB2000 from Littell et al. 1996 is a test of a treatment factor with
> 15 levels, in a design with 60 observations and 15 blocks. 
> Agresti's
> statement would imply that one would still be in trouble if the 
> total
> number of observations increased to 600 [because # blocks is still
> small], where PB2000 would imply that the LRT would be OK in this
> limit.  (A small experiment with the simulate.lme() example given on
> PB2000 p. 89 suggests that increasing the sample size 10-fold with 
> the
> same number of blocks DOES make the LRT OK ... but I would need to 
> do
> this a bit more carefully to be sure.)  (Or is this a difference
> between the linear and generalized linear case?)

Intuitively, I think this will depend on the correlation between the
measurements within each block. At the one extreme, assume that all
the measurements within each block are the same, then actual sample
size would be the number of block. At the other extreme, assume that
all the
measurements within each block are random, then the sample size would
be the total number of observations.

I know some people from Hasselt University in Belgium that have worked
on the "Effective Sample Size" for mixed models; you can check at the
following presentation given in ISCB last summer
(http://www.iscb2007.gr/ppt/Wednesday/Orfeas/16.54-17.12/slides_ISCB2007.pdf)

> 3. For multi-level models (nested, certainly crossed), how would one
> count the "number of random-effects levels" to quantify the 'sample
> size' above?  With a single random effect, we can just count the
> number of levels (blocks).  What would one do with e.g. a nested or
> crossed design?  (Perhaps the answer is "don't use a likelihood 
> ratio
> test to evaluate the significance of fixed effects".)
>
> 4. Does anyone know of any evidence (in either direction) that the
> "boundary" problems that apply to the likelihood ratio test (e.g. 
> Self
> and Liang 1987) also apply to information criteria comparisons of
> models with and without random effects?  I would expect so, since 
> the
> derivations of the AIC involve Taylor expansions around the
> null-hypothesis parameters ...

I think this is an interesting question. Let

# AIC under the null
AIC.0 = -2*logLik.0 + 2npar

# AIC under the alternative
AIC.1 = -2*logLik.1 + 2(npar + 1)

then you reject when

AIC.1 < AIC.0 =>

-2*logLik.1 + 2(npar + 1) + -2*logLik.0 + 2npar < 0 =>

LRT > 2.

Now for boundary problems and according to Stram and Lee (1994,
Biometrics), LRT ~ 0.5 * chisq(0) + 0.5 * chisq(1), for which the
critical value is 1.923. Thus, it seems to work more or less ok in
this case. However, if you wanted to test wether the variance is 10,
then LRT ~ chisq(1), for which the critical value is 3.841!

> 5. It's common sense that estimating the variance of a random effect
> from a small number of levels (e.g. less than 5) should be dicey, 
> and
> that one might in this case want to treat the parameter as a fixed
> effect (regardless of its philosophical/experimental design status).
> For small numbers of levels I would expect (?) that the answers 
> MIGHT
> be similar -- among other things the difference between df=1 and
> df=(n-1) would be small.  But ... is there a good discussion of this
> in print somewhere?  (Crawley mentions this on p. 670 of 
> "Statistical
> Computing", but without justification.)
>
> lme4-specific questions:
>
> 6. Behavior of glmer: Does glmer really use AGQ, or just Laplace?
> Both?  pp. 28-32 of the "Implementation" vignette in lme4 suggest 
> that
> a Laplace approximation is used, but I can't figure out whether this
> is an additional approximation on top of the AGQ/Laplace 
> approximation
> of the integral over the random effects used in "ordinary" LMM. 
> When
> I fit a GLMM with the different methods, the fitted objects are not
> identical but all the coefficients seem to be.  (I have poked at the
> code a bit but been unable to answer this question for myself
> ... sorry ...)

Well, in Linear Mixed Models the integral over the random effects can
be analytically evaluated and thus no approximation (i.e., AGQ or
Laplace) is required. In GLMMs this is not the case and thus the
log-likelihood needs to be calculated approximately. One method for
approximating the integral is the AGQ, and in fact Laplace is AGQ with
one quadrature point.

AFAIK (but Doug can correct if I'm wrong), glmer() uses Laplace since
AGQ is not yet implemented.

> (The glmmML package claims to fit via Laplace or Gauss-Hermite
> quadrature (with non-adaptive, but adjustable, number of quad points
> - -- so it's at least theoretically possible?)
>
> library(lme4)
> set.seed(1001)
> f = factor(rep(1:10,each=10))
> zb = rnorm(1:10,sd=2) ## block effects
> x = runif(100)
> eta = 2*x+zb[f]+rnorm(100)
> y = rpois(100,exp(eta))
>
> g1 = glmer(y~x+(1|f),family="poisson",method="Laplace")
> g2 = glmer(y~x+(1|f),family="poisson",method="AGQ")
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFIIc1Lc5UpGjwzenMRAr4uAJ90myt79pJZCa1a801FkxHRnAHYdgCfUYy+
> P0ljXHs4lt8aTwpWKncRkBg=
> =nd22
> -----END PGP SIGNATURE-----
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm