[R-sig-ME] overdispersion with binomial data?

Colin Wahl biowahl at gmail.com
Sat Feb 12 02:58:46 CET 2011


In anticipation of the weekend:
In my various readings(crawley, zuur, bolker's ecological models book, and
the GLMM_TREE article, reworked supplementary material and R help posts) the
discussion of overdispersion for glmm is quite convoluted by different
interpretations, different ways to test for it, and different solutions to
deal with it. In many cases differences seem to stem from the type of data
being analyzed (e.g. binomial vs. poisson) and somewhat subjective options
for which type of residuals to use for which models.

The most consistent definition I have found is overdispersion is defined by
a ratio of residual scaled deviance to the residual degrees of freedom > 1.

Which seems simple enough.
> modelB<-glmer(E ~ wsh*rip + (1|stream) + (1|stream:rip), data=ept,
family=binomial(link="logit"))
> rdev <- sum(residuals(modelBQ)^2)
> mdf <- length(fixef(modelBQ))
> rdf <- nrow(ept)-mdf
> rdev/rdf #9.7 >>1

So I conclude my model is overdispersed. The recent consensus solution seems
to be to create and add a individual level random variable to the model.

ept$obs <- 1:nrow(ept) #create individual level random variable 1:72
modelBQ<-glmer(E ~ wsh*rip + (1|stream) + (1|stream:rip) + (1|obs),
data=ept, family=binomial(link="logit"))

I take a look at the residuals which are now much smaller but are... just...
too... good... for my ecological (glmm free) experience to be comfortable
with. Additionally, they fit better for intermediate data, which, with
binomial errors is the opposite of what I would expect. Feel free to inspect
them in the attached image (if attachments work via mail list... if not, I
can send it directly to whomever is interested).

Because it looks too good... I test overdispersion again for the new model:

rdev/rdf #0.37

Which is terrifically underdispersed, for which the consensus is to ignore
it (Zuur et al. 2009).

So, for my questions:
1. Is there anything relevant to add to/adjust in my approach thus far?
2. Is overdispersion an issue I should be concerned with for binomial
errors? Most sources think so, but I did find a post from Jerrod Hadfield
back in august where he states that overdispersion does not exist with a
binary response variable:
http://web.archiveorange.com/archive/v/rOz2zS8BHYFloUr9F0Ut (though in
subsequent posts he recommends the approach I have taken by using an
individual level random variable).
3. Another approach (from Bolker's TREE_GLMM article) is to use Wald t or F
tests instead of Z or X^2 tests to get p values because they "account for
the uncertainty in the estimates of overdispersion." That seems like a nice
simple option, I have not seen this come up in any other readings. Thoughts?




Here are the glmer model outputs:

ModelB
Generalized linear mixed model fit by the Laplace approximation
Formula: E ~ wsh * rip + (1 | stream) + (1 | stream:rip)
   Data: ept
   AIC BIC logLik deviance
 754.3 777 -367.2    734.3
Random effects:
 Groups     Name        Variance Std.Dev.
 stream:rip (Intercept) 0.48908  0.69934
 stream     (Intercept) 0.18187  0.42647
Number of obs: 72, groups: stream:rip, 24; stream, 12

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.28529    0.50575  -8.473  < 2e-16 ***
wshd        -2.06605    0.77357  -2.671  0.00757 **
wshf         3.36248    0.65118   5.164 2.42e-07 ***
wshg         3.30175    0.76962   4.290 1.79e-05 ***
ripN         0.07063    0.61930   0.114  0.90920
wshd:ripN    0.60510    0.94778   0.638  0.52319
wshf:ripN   -0.80043    0.79416  -1.008  0.31350
wshg:ripN   -2.78964    0.94336  -2.957  0.00311 **

ModelBQ

Generalized linear mixed model fit by the Laplace approximation
Formula: E ~ wsh * rip + (1 | stream) + (1 | stream:rip) + (1 | obs)
   Data: ept
   AIC   BIC logLik deviance
 284.4 309.5 -131.2    262.4
Random effects:
 Groups     Name        Variance Std.Dev.
 obs        (Intercept) 0.30186  0.54942
 stream:rip (Intercept) 0.40229  0.63427
 stream     (Intercept) 0.12788  0.35760
Number of obs: 72, groups: obs, 72; stream:rip, 24; stream, 12

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -4.2906     0.4935  -8.694  < 2e-16 ***
wshd         -2.0557     0.7601  -2.705  0.00684 **
wshf          3.3575     0.6339   5.297 1.18e-07 ***
wshg          3.3923     0.7486   4.531 5.86e-06 ***
ripN          0.1425     0.6323   0.225  0.82165
wshd:ripN     0.3708     0.9682   0.383  0.70170
wshf:ripN    -0.8665     0.8087  -1.071  0.28400
wshg:ripN    -3.1530     0.9601  -3.284  0.00102 **


Cheers,
-- 
Colin Wahl
Department of Biology
Western Washington University
Bellingham WA, 98225
ph: 360-391-9881
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ModelComp2.png
Type: image/png
Size: 59591 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20110211/97dc12bf/attachment-0003.png>


More information about the R-sig-mixed-models mailing list