[R-sig-ME] checking overdispersion when modelling proportions

Sun Jan 29 02:44:09 CET 2012

Kristen Mancuso <kmancuso88 at ...> writes:

> 
> Hi,
> 
> When looking at the R book, it suggests one should be able to check for
> overdispersion when looking at the summary output of a mixed model with
> proportional data, but I am confused about how to assess this since it does
> not give residual deviance or degrees of freedom.  Any help on this would
> be greatly appreciated!
> 
> Here is the output I receive after fitting the model:
> 
> > model<-lmer(y~ Treatment*Type + (1|Site), family=binomial)
> > summary(model)
> Generalized linear mixed model fit by the Laplace approximation
> Formula: y ~ Treatment * Type + (1 | Site)
>    AIC   BIC logLik deviance
>  37.39 47.99 -9.694    19.39
> Random effects:
>  Groups Name        Variance  Std.Dev.
>  Site   (Intercept) 0.0023005 0.047964
> Number of obs: 24, groups: Site, 12
> 
> Fixed effects:

  I just added this to http://glmm.wikidot.com/faq -- comments welcome.

How can I test for overdispersion?

    with the usual caveats (e.g. see Venables and Ripley MASS p. 209), plus a
few extras — counting degrees of freedom, etc. — the usual procedure of
calculating the sum of squared Pearson residuals and comparing it to the
residual degrees of freedom should give at least a crude idea of overdispersion.
The following crude attempt counts each variance or covariance parameter as one
model degree of freedom and presents the sum of squared Pearson residuals, the
ratio of (SSQ residuals/rdf), the residual df, and the $$p$$-value based on the
(approximately!!) appropriate $$\chi^2$ distribution.

overdisp_fun <- function(model) {
  ## number of variance parameters in 
  ##   an n-by-n variance-covariance matrix
  vpars <- function(m) {
    nrow(m)*(nrow(m)+1)/2
  }
  model.df <- sum(sapply(VarCorr(model),vpars))+length(fixef(model))
  (rdf <- nrow(model at frame)-model.df)
  rp <- residuals(model)
  Pearson.chisq <- sum(rp^2)
  prat <- Pearson.chisq/rdf
  pval <- pchisq(Pearson.chisq, df=rdf, lower.tail=FALSE)
  c(chisq=Pearson.chisq,ratio=prat,rdf=rdf,p=pval)
}