[R-sig-ME] dispersion parameter in count data with lmer

Fri Mar 4 00:17:15 CET 2011

On 11-03-02 11:53 AM, alexandre martin wrote:
> Dear all,
> 
> I hope my question will not be considered too simple to be asked here. I
> will try to explain my problem clearly.
> 
> I am working on longitudinal data to explore male reproductive success (rs).
> These data are therefore count data, and I deal with them by specifying a
> Poisson family in my mixed models (random variable = individual identity).
> My data contain a large number of zeros, as shown by the table() function
> applied to my response variable:
> 
> table(rs)
>   0   1   2   3   4   5   6   7
> 365  60  20   9   8   5   3   1
> 
> 
> I am wondering:
> 1- whether this matches the description of overdispersion or of a
> zero-inflated distribution.

  If this is the marginal distribution, it's hard to tell. A certain
fraction of the apparent overdispersion/zero-inflation can be explained
simply by differences in the means among treatment groups or random
effect levels.  Beyond that, a large number of zeros can potentially be
explained either via overdispersion or via zero-inflation (e.g. Warton,
David I. 2005. Many zeros does not mean zero inflation: comparing the
goodness-of-fit of parametric models to multivariate abundance data.
Environmetrics 16, no. 3: 275-289. doi:10.1002/env.702.
http://dx.doi.org/10.1002/env.702.)
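
  As a rough illustration of the marginal check (a sketch only, using the
counts from the table quoted above; the variable names are mine), you can
compare the variance to the mean and the observed number of zeros to what
a single Poisson distribution with the same mean would predict. The
variance comes out larger than the mean and there are more zeros than
expected, but because this ignores covariates and the individual-level
random effect it cannot distinguish between the explanations above.

```r
# Counts copied from the table(rs) output quoted above
counts <- c(365, 60, 20, 9, 8, 5, 3, 1)
values <- 0:7

rs <- rep(values, times = counts)   # reconstruct the raw response vector
n  <- length(rs)
m  <- mean(rs)
v  <- var(rs)

# Under a single Poisson distribution the variance equals the mean,
# so a ratio well above 1 indicates *marginal* overdispersion.
ratio <- v / m

# Number of zeros a single Poisson with the same mean would predict
expected_zeros <- n * dpois(0, lambda = m)
observed_zeros <- sum(rs == 0)

print(c(mean = m, variance = v, ratio = ratio))
print(c(observed_zeros = observed_zeros, expected_zeros = expected_zeros))
```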

  The easiest way to incorporate overdispersion in the current lme4
framework is to add an observation-level random effect (as discussed in
many posts on the list: also see <http://glmm.wikidot.com>).
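
  A minimal sketch of what that looks like, assuming simulated data (the
data frame `dat`, the covariate `age` and the simulation settings are all
hypothetical; only `rs` and the individual identity come from the
question). The observation-level random effect is just a factor with one
level per row, added as an extra random intercept.

```r
library(lme4)

set.seed(1)
# Hypothetical longitudinal data: 60 males observed in 8 seasons each
dat <- data.frame(id  = factor(rep(1:60, each = 8)),
                  age = rep(1:8, times = 60))

# Simulated overdispersed counts: individual effects plus negative
# binomial noise (purely illustrative, not the poster's data)
ind_eff <- rnorm(60, sd = 0.5)
mu      <- exp(-1 + 0.1 * dat$age + ind_eff[as.integer(dat$id)])
dat$rs  <- rnbinom(nrow(dat), mu = mu, size = 0.5)

# One factor level per row: the observation-level random effect
dat$obs <- factor(seq_len(nrow(dat)))

# glmer() is the GLMM fitting function in current lme4; at the time of
# this thread the same model was specified via lmer(..., family = poisson)
fit_olre <- glmer(rs ~ age + (1 | id) + (1 | obs),
                  data = dat, family = poisson)

# The variance of the 'obs' term absorbs the extra-Poisson variation
print(VarCorr(fit_olre))
```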

> 2- how it is possible to calculate a c-hat, given that determining the
> number of parameters in glmm is not an easy task...

  This is a quasi-likelihood approach.  You can get an *approximation*
of the residual deviance (the Pearson chi-square statistic) via

sum(residuals(modelfit,type="pearson")^2)

Dividing this sum by the residual degrees of freedom gives c-hat.  What
you use for the number of parameters (specifically, how you count
degrees of freedom for random effects) may depend (as also discussed
frequently on this list and on glmm.wikidot.com) on what you are trying
to do ...
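
  Here is one way the Pearson-residual sum could be turned into a c-hat
estimate. This is a sketch under stated assumptions: the data are
simulated, and I count one parameter per fixed effect plus one per
random-effect variance parameter, which is only one of several
defensible conventions.

```r
library(lme4)

set.seed(2)
# Hypothetical data: 60 individuals, 8 observations each
dat    <- data.frame(id = factor(rep(1:60, each = 8)))
mu     <- exp(-1 + rnorm(60, sd = 0.5)[as.integer(dat$id)])
dat$rs <- rnbinom(nrow(dat), mu = mu, size = 0.5)

modelfit <- glmer(rs ~ 1 + (1 | id), data = dat, family = poisson)

# Pearson chi-square statistic, as in the one-liner above
pearson_chisq <- sum(residuals(modelfit, type = "pearson")^2)

# ASSUMPTION: each variance parameter counts as a single parameter
n_par    <- length(fixef(modelfit)) + length(getME(modelfit, "theta"))
resid_df <- nrow(dat) - n_par

# Values well above 1 suggest overdispersion
c_hat <- pearson_chisq / resid_df
print(c(chisq = pearson_chisq, df = resid_df, c_hat = c_hat))
```

Counting the random effect as one parameter is the crude choice; if you
count effective degrees of freedom differently, only `n_par` changes.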

> 3- if the really bad fitted values obtained for the highest reproductive
> success are the result of a first estimation with a Poisson family.
> 

  See answers above ...

  You can use glmmADMB for negative binomial or zero-inflated
distributions (or both); at the moment it only allows one random
effect (we're working on it: contact me off-list if you are really
desperate). You can also use MCMCglmm for zero-inflated data (MCMCglmm
always includes an observation-level random effect).

  Ben Bolker
