[R-sig-ME] Selecting random effects in lme4: ML vs. QAICc
Ben Bolker
bbolker at gmail.com
Tue Aug 17 18:42:51 CEST 2010
Richard Feldman wrote:
> Thank you very much for your informative response.
>
> I guess I'm confused about how to interpret overdispersion. Using the
> Zuur et al. Owls data that you also present in your worked example, I
> ran the following models, keeping the fixed and random effects just as
> you present them:
>
> glmer, family = gaussian
> glmer, family = poisson
> glmer, family = quasipoisson
> glmmPQL, family = quasipoisson
> glmmADMB, family = poisson ##zeroInflation=FALSE
> glmmADMB, family = negative binomial ##zeroInflation=FALSE
> glmmADMB, family = poisson ##zeroInflation=TRUE
> glmmADMB, family = negative binomial ##zeroInflation=TRUE
>
>
> For all but the quasipoisson glmer, I calculated dispersion following
> your example:
>
> rdev <- sum(residuals(model)^2)
> mdf  <- length(fixef(model))
> rdf  <- nrow(Data) - mdf
> rdev/rdf
>
> For the quasipoisson glmer, I extracted dispersion as:
>
> lme4:::sigma(model)^2
>
> The results are as follows:
>
> #   model                        dispersion
> # 1 glmer.gaussian               35.534989
> # 2 glmer.pois                    5.630751
> # 3 glmer.quasi (sigma(model)^2) 29.830221
> # 4 glmmPQL.quasi                 1.076906
> # 5 glmmADMB.pois                 7.585654
> # 6 glmmADMB.nbinom               1.085072
> # 7 glmmADMB.pois.zero            8.255389
> # 8 glmmADMB.nbinom.zero          1.516587
>
> Am I right to interpret this as saying 1) the sigma(model)^2 method is
> inaccurate and 2) the glmmPQL.quasi and glmmADMB.nbinom are
> sufficiently correcting for the overdispersion? I thought I had read
> you advising the use of the quasipoisson model to calculate the
> overdispersion parameter (assuming it did so correctly) needed to
> adjust AIC to QAIC. It would seem, though, that the dispersion
> parameter should come from the Poisson regression. More likely I just
> misread you.
> Also, is it a concern that the dispersion is higher in the
> zero-inflated models? Does this mean zero-inflation is not an issue,
> at least when wanting to calculate AIC values?

(1) Yes. I would agree that we don't quite know what's going on with
quasi- in glmer, and that using other methods is better if possible:
various people have reported odd results, Doug Bates has gone on record
as saying he wouldn't really know how to interpret a quasi-likelihood
GLMM anyway (I think that's a fair summary of his position), and it's
not clear whether the problem lies in bugs in a little-tested corner of
the software or in fundamental problems with the definition of the
model. That said, quasi- is also the easiest way forward ...
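
On the QAIC(c) part of your question: here is a minimal sketch (my own
object names, not yours -- 'fit.pois' for the glmer Poisson fit, 'Data'
for the data frame) of taking c-hat from the Poisson fit, as in the
calculation you quote above, and plugging it into QAIC/QAICc. The "+ 1"
follows the usual convention of counting c-hat as an estimated
parameter.

chat  <- sum(residuals(fit.pois)^2) /
         (nrow(Data) - length(fixef(fit.pois)))  ## c-hat, as in your calculation
ll    <- as.numeric(logLik(fit.pois))            ## (approximate) log-likelihood
k     <- attr(logLik(fit.pois), "df") + 1        ## + 1 for estimating c-hat
n     <- nrow(Data)
QAIC  <- -2 * ll / chat + 2 * k
QAICc <- QAIC + 2 * k * (k + 1) / (n - k - 1)    ## small-sample correction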

(2) glmmPQL.quasi uses penalized quasi-likelihood, so at least it's
consistent in the way it handles the random effects and the
individual-level variance structure. PQL is known to be a bit dicey for
count data with small means (e.g. means < 5) [Breslow 2003], not that
that has stopped lots of people from using it, because for a long time
it was the only game in town.

(3) The dispersion of approximately 1 in the negative binomial models
does look reasonable. See Venables and Ripley's section on
overdispersion for some cautions on this approach ...

(4) I don't know why the dispersion is coming out slightly higher for
the zero-inflated negative binomial -- seems odd.

(5) Since you've gone to all this trouble to fit the overdispersed and
zero-inflated likelihood models, your best bet is to try likelihood
ratio tests between nested models (e.g. #8 vs #6 to test for
zero-inflation, #8 vs #7 or #6 vs #5 to test for overdispersion); a
sketch follows below. The quasi- and overdispersion calculations are
usually done as a way of avoiding having to fit the more complex models
at all.
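
For concreteness, a minimal sketch of the zero-inflation comparison (#8
vs #6), assuming the glmmADMB fits are stored in objects I'll call
'fit.nb' and 'fit.nb.zi', and that a log-likelihood can be extracted
from them (via logLik(), or from the printed summary):

ll0 <- as.numeric(logLik(fit.nb))     ## negative binomial, no zero-inflation (#6)
ll1 <- as.numeric(logLik(fit.nb.zi))  ## negative binomial + zero-inflation (#8)
lrt <- 2 * (ll1 - ll0)                ## 1 df: the zero-inflation probability
pchisq(lrt, df = 1, lower.tail = FALSE)

Keep in mind that the null value of the zero-inflation probability
(zero) is on the boundary of the parameter space, so the naive
chi-squared p-value is conservative; halving it is a common rough
correction.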