[R-sig-ME] testing for overdispersion in the mixed effects logistic regression
Ben Bolker
bbolker at gmail.com
Wed Jan 23 17:40:56 CET 2013
Antonio P. Ramos <ramos.grad.student at ...> writes:
>
> Hi all,
>
> I would like to check whether I am properly checking for overdispersion in
> a mixed effects logistic regression. Here is the code:
>
> m1 <- lmer(mortality.under.2 ~ maternal_age_d +B0 + B4 + V102 + BORD +
> WLTHIND5+ time + new_time + democracy + V106 +
> WLTHIND5*time + WLTHIND5*new_time + democracy*WLTHIND5 +
> (1 |CASEID)-1,data=brazil1,family=binomial(link="logit"))
>
> # testing for overdispersion
> n <- length(residuals(m1)) # the number of obs
> k <- length(unique(brazil1$CASEID)) # the number of random effects
> z <- sum(residuals(m1,type=standardize)^2) # squared, standardize residuals
> cat("the overdispersion factor is",z/(n-k),"\n")
> cat("the p-value of the overdispersion test is", pchisq(z,n-k))
>
If your response variable is binary (as I'm guessing it is since
you appear to have a single response and not something of the form
cbind(success,failure), but I can't be sure), then testing for
overdispersion doesn't make any sense -- any among-observation
variance is unidentifiable.
In any case:
* type=standardize is unlikely to work, the normal expectation
is that type should be a string (e.g. see ?residuals.lme in the nlme
package); however, the residuals method
in the stable version of lme4 doesn't even take a "type" argument
getMethod("residuals","mer")@.Data
function (object, ...)
napredict(attr(object at frame, "na.action"), object at resid)
<environment: namespace:lme4.0>
-- yes, this should be better documented!
help("residuals,mer-method") [ugh] tells you that residuals() returns
the Pearson residuals by default:
‘resid’: The residuals, y-mu, weighted by the ‘sqrtrWt’ slot (when
its length is >0).
The last comments are that (1) you should be using lower.tail=FALSE
in your pchisq() call; (2) it's not clear what the "residual df" really
are for random effects models (subtracting the number of random effects
estimates is probably conservative --- but why aren't you subtracting
the fixed effect parameter df too?); (3) the Pearson residuals are
a fairly poor estimate of the scale parameter/overdispersion when the
data are far from conditionally normal (e.g. binomial or Poisson with
small numbers of counts) -- see Venables and Ripley; (4) there's more
information on all of this at http://glmm.wikidot.com/faq#overdispersion.
Ben Bolker
(Yes, this should be better documented!)
More information about the R-sig-mixed-models
mailing list