[R-sig-ME] Prudent steps for overdispersion in glmer models (logit link)

Benjamin Dantzer bendantzer at gmail.com
Thu Jul 9 19:42:04 CEST 2009


Dear Mixed Modelers,

I encounter substantial overdispersion (dispersion parameters > 13) when
analyzing unbalanced proportion data, and I'm trying to understand what
steps are prudent for ecologists to follow when fitting GLMMs (logit
link) to overdispersed data using lme4. I recognize that there are other
sources of information about this topic and have read widely, but much
of my uncertainty comes from the current issue with lme4 and
quasi-likelihood (quasibinomial in my case) that is discussed elsewhere:
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q3/001404.html
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001632.html

I use the following behavioral data as an example. These data come from
7 min focal observations in which specific behaviors are recorded at
30 s intervals. In addition to multivariate approaches, I use GLMMs to
determine how the proportions of specific behaviors vary across a
season or across breeding attempts.
	In the example below, I'm interested in how the proportion of time a
squirrel spends eating changes seasonally. A quadratic effect of day is
included to allow for non-linearity. I first fit a GLM with fixed
effects only to look for overdispersion, and then a GLMM with random
effects for both animal and observer (because there are repeated
measures on animals and by observers).


Mac OS X, R version 2.9.0, lme4 version 0.999375-31

GLM Example to assess overdispersion:

Call:
glm (formula = cbind (No.Nest, 15 - No.Nest) ~ poly (Day, 2),  family  
= binomial (link=logit), data = focals)

Deviance Residuals:
    Min      1Q  Median      3Q     Max
-5.326  -2.910  -2.664   3.189   6.967

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)    -1.08320    0.02004 -54.042   <2e-16 ***
poly(Day, 2)1  -8.36175    0.59299 -14.101   <2e-16 ***
poly(Day, 2)2   4.92777    0.58027   8.492   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 14476  on 902  degrees of freedom
Residual deviance: 14179  on 900  degrees of freedom
AIC: 14362

Number of Fisher Scoring iterations: 5
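
For reference, the dispersion implied by the GLM summary above can be
read off as residual deviance over residual degrees of freedom,
14179 / 900, or about 15.8. The sketch below shows one way to compute
both the Pearson-based and the deviance-based estimates from a fitted
binomial GLM. Since the focals data are not included in this post, it
uses simulated data as a stand-in: the data frame sim, the number of
animals, and the effect sizes are all assumptions for illustration
only, not the real data.

```r
# Simulated stand-in for the focal data (hypothetical values): 15 scans
# per focal, with between-animal variation that the fixed-effects GLM
# ignores, which produces overdispersion.
set.seed(1)
n_focals <- 300
id <- sample(1:40, n_focals, replace = TRUE)   # hypothetical animal IDs
sim <- data.frame(Day = runif(n_focals, 1, 120))
animal_effect <- rnorm(40, sd = 1.5)
eta <- -1 - 0.01 * sim$Day + animal_effect[id]
sim$No.Nest <- rbinom(n_focals, size = 15, prob = plogis(eta))

# Same model structure as the GLM call in the post.
fit <- glm(cbind(No.Nest, 15 - No.Nest) ~ poly(Day, 2),
           family = binomial(link = "logit"), data = sim)

# Two common dispersion estimates; values well above 1 suggest
# overdispersion relative to the binomial assumption.
disp_pearson  <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
disp_deviance <- deviance(fit) / df.residual(fit)

# Deviance-based ratio using the numbers reported in the summary above.
disp_reported <- 14179 / 900
```

The Pearson and deviance versions usually differ somewhat, so a
Pearson-based value computed on the real data need not match the
deviance-based ratio exactly.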




GLMER Example:

Because there are repeated samples on the same animals and potentially  
observer effects, I include random effects for both animal and observer.


glmer (cbind (No.Nest, 15-No.Nest) ~ poly (Day,2) + (1|OBS) + (1|ID),  
family=binomial (link = logit), focals, verbose=TRUE)

   0:     12075.576: 0.607569 0.343693 -1.08320 -8.36175  4.92777
   1:     11752.951:  1.49913 0.795727 -1.10953 -8.37088  4.92682
   2:     11744.740:  1.47953  1.04743 -1.38218 -8.54319  4.89649
   3:     11726.917:  1.88345  1.10659 -1.40701 -8.58032  4.89063
   4:     11721.286:  2.01823  1.03093 -1.87186 -9.31820  4.75224
   5:     11718.158:  2.41774  1.44964 -1.63616 -9.95216  4.64645
   6:     11713.721:  2.26994  1.26734 -1.55596 -10.8047  4.52269
   7:     11712.202:  2.22602  1.19583 -1.59063 -11.6781  4.70304
   8:     11712.095:  2.20938  1.18793 -1.66349 -12.1621  4.17028
   9:     11711.920:  2.20456  1.20150 -1.70144 -12.1439  4.55884
  10:     11711.912:  2.20558  1.20779 -1.68434 -12.0797  4.51719
  11:     11711.912:  2.20180  1.20734 -1.68745 -12.0847  4.51349
  12:     11711.912:  2.20177  1.20903 -1.68700 -12.0926  4.51477
  13:     11711.912:  2.20155  1.20892 -1.68680 -12.0894  4.51528
  14:     11711.912:  2.20153  1.20893 -1.68678 -12.0893  4.51523

Generalized linear mixed model fit by the Laplace approximation
Formula: cbind(No.Nest, 15 - No.Nest) ~ poly(Day, 2) + (1 | OBS)  
+      (1 | ID)
    Data: focals.all.repro
    AIC   BIC logLik deviance
  11722 11746  -5856    11712
Random effects:
 Groups Name        Variance Std.Dev.
 ID     (Intercept) 4.8467   2.2015
 OBS    (Intercept) 1.4615   1.2089
Number of obs: 903, groups: ID, 125; OBS, 40

Fixed effects:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)     -1.6868     0.3776  -4.467 7.95e-06 ***
poly(Day, 2)1  -12.0893     1.0519 -11.492  < 2e-16 ***
poly(Day, 2)2    4.5139     0.8527   5.294 1.20e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
             (Intr) p(RD,2)1
ply(RpD,2)1  0.000
ply(RpD,2)2 -0.013 -0.027



Because of the current issue with quasi- families in lme4 (see links
above), am I basically restricted to either 1) dropping the random
effects and using quasibinomial with GLM, or 2) acknowledging the
presence of overdispersion but arguing that much of it is due to
heterogeneity across individuals and observers? In other examples I
frequently get standard deviations of random effects nearly as large as
the estimates of the fixed effects (as in the example in Bolker et al.,
2008). There are no high-leverage outlying observations for the fixed
or random effects, and including additional covariates doesn't
significantly decrease the dispersion parameter.
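
To make option 1 concrete, here is a minimal sketch of a quasibinomial
GLM fit next to the plain binomial fit. It again uses simulated
stand-in data (the data frame sim and all effect sizes are
hypothetical), and it deliberately ignores the repeated measures on
animals and observers, which is exactly the cost of this option. The
point it illustrates is that the coefficient estimates are unchanged
and the standard errors are inflated by the square root of the
estimated dispersion.

```r
# Simulated stand-in for the focal data (hypothetical values).
set.seed(2)
n_focals <- 300
id <- sample(1:40, n_focals, replace = TRUE)   # hypothetical animal IDs
sim <- data.frame(Day = runif(n_focals, 1, 120))
eta <- -1 - 0.01 * sim$Day + rnorm(40, sd = 1.5)[id]
sim$No.Nest <- rbinom(n_focals, size = 15, prob = plogis(eta))

# Plain binomial fit: dispersion fixed at 1.
fit_bin <- glm(cbind(No.Nest, 15 - No.Nest) ~ poly(Day, 2),
               family = binomial(link = "logit"), data = sim)

# Quasibinomial fit: same mean model, dispersion estimated from the data.
fit_quasi <- glm(cbind(No.Nest, 15 - No.Nest) ~ poly(Day, 2),
                 family = quasibinomial(link = "logit"), data = sim)

# Estimated dispersion (Pearson chi-square / residual df).
phi <- summary(fit_quasi)$dispersion

# Standard errors under each fit; the ratio is sqrt(phi) for every term.
se_bin   <- coef(summary(fit_bin))[, "Std. Error"]
se_quasi <- coef(summary(fit_quasi))[, "Std. Error"]
se_ratio <- se_quasi / se_bin
```

With a dispersion near the deviance-based 14179 / 900 from the GLM
summary, the standard errors would grow by a factor of roughly 4, so
the z values in the binomial GLM table would shrink accordingly.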

Looking forward to your opinions.


-Ben Dantzer

__________________________________
Ben Dantzer
PhD Candidate

Ecology, Evolutionary Biology, and Behavior Program
Department of Zoology
203 Natural Science Building
Michigan State University
East Lansing, MI 48824-115

Phone: 	517-432-5555
Fax:	 	517-432-2789
Web:	http://www.msu.edu/~dantzer
		http://www.redsquirrel.msu.edu



