[R-sig-ME] lmer vs glmmPQL

Roger Levy rlevy at ling.ucsd.edu
Wed Jun 24 19:14:58 CEST 2009

Ben Bolker wrote:
> Daniel Ezra Johnson wrote:
>> Regarding this thread, what about the method of fitting nested models
>> and using anova() to estimate a p-value.
>> For example:
>> mod2 = lmer(y ~ genotype + (1|block), family = binomial, data)
>> mod0 = lmer(y ~ (1|block), family = binomial, data)
>> anova(mod0,mod2)
>> How does that p-value (which is one value for the whole term
>> "genotype") relate to the individual coefficient p-values derived from
>> the Z-scores inside summary(mod2)?
>> Thanks,
>> Dan
>    The Z-scores are Wald tests.  The anova results are likelihood
> ratio tests.  The latter are in general more reliable but are known
> to be *un*reliable for small-sample (i.e.
> small-number-of-random-effects-levels) LMMs (Pinheiro and Bates).
> Wald tests are not *known* to be unreliable in the small-sample
> case, but I believe that is a statement of ignorance rather than
> a statement that they're OK ...

This is an interesting issue -- as Ben points out, likelihood-ratio 
tests are known to be anti-conservative for tests between linear mixed 
models differing only in fixed-effects structure.  I think the 
best-known demonstration of this can be found in Pinheiro & Bates, 2000, 
pages 87-92.  F-tests are less problematic.

For logit mixed models, presumably the results will depend a bit on how 
fitting is done because the likelihood has to be approximated.  For the 
Laplace approximation currently in use in lme4, I've done some 
simulations and generally found that likelihood-ratio tests are still 
anti-conservative; tests based on Wald Z are not as anti-conservative, 
though they don't seem quite nominal.  You can do simulations along 
these lines:

invlogit <- function(x) {
   temp <- exp(x)
L <- 10
REP <- 4
N <- 4*10
fac <- gl(L,REP)
f <- function(ignore) {
   x <- runif(N)
   r <- rnorm(L)
   y <- rbinom(N,1,invlogit(r[fac]))
   m0 <- lmer(y ~ (1 | fac), family="binomial")
   m1 <- lmer(y ~ x + (1 | fac), family="binomial")
   temp <- pnorm(fixef(m1)[2]/sqrt(vcov(m1)[2,2]))
   p.z <- 2*min(temp, 1-temp)
   p.lr <- 1-pchisq(as.numeric(2*(logLik(m1)-logLik(m0))),1)
K <- 5000
p.vals <- sapply(1:K,f)
apply(p.vals, 1, function(x) sum(x<0.05))/K
# this gave me 0.0562 for Wald Z, 0.0608 for LRT

# Is Wald Z anti-conservative?  The below says "yes" with p=0.044
2*(1-pbinom(sum(p.vals[1,]<0.05), K, 0.05))

Unlike what Ben states above, however, it's my intuition (and informal 
observation on the basis of simulations) that what determines how 
anticonservative the LRT is is not the number of random effect levels 
but rather the ratio between the degrees of freedom involved in the LRT 
to the residual degrees of freedom in the more general model.  The more 
residual degrees of freedom (basically, the more observations), the less 
anti-conservative the LRT is.  This can be assessed for a specific 
dataset by fitting a null-hypothesis model, repeatedly generating new, 
artificial responses from it, and looking at the distribution of LRTs on 
the artificial responses (along the lines of P&B 2000).


More information about the R-sig-mixed-models mailing list