[R-sig-ME] GLMM review, revisited

Tue Jun 24 21:03:44 CEST 2008

   Dear r-sig-mixed'ers:

  Thanks for all your input on our GLMM review.

  Of course, a variety of other interesting questions
have come up, and I thought I would run them by you.
If you're interested, you can look at a draft at
http://www.zoology.ufl.edu/bolker/glmm_review-24jun.pdf . I've also set 
up glmm.wikidot.com , which doesn't have
much on it yet but hopefully will have some worked examples
etc.  (I would have used the R wiki, but the scope
is more general, e.g. including SAS examples.)

   Now a few more general questions, having to do with
degrees of freedom, hypothesis testing, and p-values (with
apologies to those who are tired of this topic ...)

   1. I have heard ("on the street") that LR tests are preferred to 
Wald/F for testing random effects.  In the paper we just say that that's
because they make weaker assumptions.  Does anyone have a (pref.
peer-reviewed) ref. for the assertion that LR is better in this case?

   2. Supposing one decides to use Wald/F tests to test fixed effects.
The "numerator" degrees of freedom are known (1 for covariates, n-1
for factors with n levels).  It's my understanding that Wald tests
ignore uncertainty in the sd estimate (hence are analogous to Z tests),
and therefore don't need "residual df" values.  (On the other hand,
Littell 2006 shows examples using a t test with residual df, although
he does say "In generalized linear models it is often desirable to
perform chi-square-based inferences instead of t- or F-based
inferences", and I recall reading somewhere (??) that the df-correction
to the Wald
test was not worthwhile.)  Our bottom line interpretation is that
df estimates are necessary when (1) doing Wald/F tests of random effects
[which you shouldn't do? see #1] or (2) testing fixed effects in
the presence of overdispersion (and that one should do an F test
in this case and a Wald in the absence of overdispersion).
Opinions?
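
   To make the Z-versus-t distinction in #2 concrete, here is a minimal
sketch in Python (standard library only) that computes a two-sided
p-value for the same Wald statistic two ways: against a standard normal
(no residual df needed) and against a t distribution with a given
residual df.  The statistic 2.0 and the df values are made up for
illustration, and the function names are my own, not from any package.

```python
import math

def wald_z_p(stat):
    """Two-sided p-value treating the Wald statistic as standard normal."""
    return math.erfc(abs(stat) / math.sqrt(2.0))

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2.0 * math.log1p(x * x / df))

def wald_t_p(stat, df, steps=2000):
    """Two-sided p-value against t(df), integrating the density from 0
    to |stat| with Simpson's rule (steps must be even)."""
    upper = abs(stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * t_pdf(i * h, df)
    central_half = total * h / 3.0   # P(0 < T < |stat|)
    return max(0.0, 1.0 - 2.0 * central_half)

stat = 2.0  # hypothetical estimate / standard error
print("Z test:      p = %.4f" % wald_z_p(stat))
for df in (5, 20, 100):  # hypothetical residual df values
    print("t test df=%3d: p = %.4f" % (df, wald_t_p(stat, df)))
```

The t-based p-value is always the larger of the two and approaches the
Z-based one as the residual df grows, so the choice of df matters
mainly when it is small.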

   3. A random, connecting-the-dots query: I'm puzzled that
in the context of likelihood ratio testing the appropriate test
distribution is chi-squared with a mixture between 0 and 1 df
(for a single variance term), while the appropriate degrees of freedom
calculation more generally (e.g. for AIC etc.) is thought to be between
1 and N-1 -- it feels like these corrections are in opposite directions 
-- i.e., the LR test of a random effect on 1 df is *conservative* 
because of boundary effects, but when we think in other contexts,
using 1 df to denote "just a single variance term" is 
*anticonservative*.  What difference in context am I missing ... ?
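
   For the boundary case in #3, a minimal sketch (Python, standard
library only) of what the mixture does to the p-value.  It assumes the
usual equal-weight mixture of chi-squared with 0 and 1 df for a single
variance term; the 0.5 weight is that assumption, the LR statistic of
2.71 is made up, and the function names are my own.

```python
import math

def chisq1_sf(x):
    """Upper tail of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0)) if x > 0 else 1.0

def boundary_lrt_p(lr_stat, weight_zero=0.5):
    """P-value under a mixture of chi-squared(0) and chi-squared(1).

    chi-squared(0) is a point mass at zero, so for a positive statistic
    only the chi-squared(1) component contributes to the upper tail.
    weight_zero = 0.5 is the assumed mixing weight.
    """
    if lr_stat <= 0:
        return 1.0
    return (1.0 - weight_zero) * chisq1_sf(lr_stat)

lr_stat = 2.71  # hypothetical likelihood ratio statistic
print("naive 1-df p-value: %.4f" % chisq1_sf(lr_stat))
print("mixture p-value:    %.4f" % boundary_lrt_p(lr_stat))
```

With equal weights the mixture p-value is just half the naive 1-df
p-value, which is the sense in which the naive test is conservative.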

   [Congratulations and thanks if you read this far.]

   cheers
     Ben Bolker