[R-sig-ME] Mixed multinomial regression for count data with overdispersion & zero-inflation

Tue May 17 00:41:04 CEST 2016

Stéphanie Périquet <stephanie.periquet at ...> writes:

> 
> Dear list members,
> 
> First sorry for this very long first post …

  That's OK.  I'm only going to answer part of it, because it's long.
> 
> I am looking for advice on fitting a mixed multinomial regression to count
> data that are overdispersed and zero-inflated. My question is to evaluate
> the effect of season and moonlight on diet composition of bat-eared foxes.
> My dataset is composed of 14 possible prey items, 20 individual foxes
> observed, 4 seasons and a moon illumination index ranging from 0 to 1 in
> 0.1 increments (considered as a continuous variable even if it takes only 11
> values). For each unique combination of individual*season*moon, I thus have
> 14 lines, one for the count of each prey item.
> 
> From what I gathered, it would be possible to use a standard GLMM of
> the following form to answer my question (i.e. a multinomial regression):
> 
> glmer(count~item+item:season+item:moon+offset(logduration)+
> (1|indiv)+(1|obs)+(1|id), family=poisson)

  Yes, but I don't know if this will account for the possible dependence
*among* prey types.

> 
> where count is the number of prey of a given type recorded eaten;
> 
> item is the prey type;
> 
> logduration is the log(total time observed for a given combination of
> individual*season*moon);
> 
> obs is a unique id for each combination of individual*season*moon, so each
> obs value groups 14 lines (one for each prey item) with the same
> individual*season*moon;
> 
> id is a unique id for each line to account for overdispersion (as
> quasi-poisson or negative binomial distributions are not implemented in
> lme4, Elston et al. 2001).

   Seems about right.
   There is glmer.nb now, but you might not want it; it tends to
be slower and more fragile, and you'd still have to deal with
zero-inflation.
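
  For concreteness, here is a minimal sketch of the long-format layout and
the Poisson model described above. The data are simulated and purely
hypothetical (4 prey items, 10 foxes and 3 moon values rather than your real
design), all numeric values are made up, and the fit only runs if lme4 is
installed.

```r
set.seed(42)

# Hypothetical design, smaller than the real one (14 items, 20 foxes, 11 moon values)
dat <- expand.grid(item   = factor(paste0("prey", 1:4)),
                   indiv  = factor(1:10),
                   season = factor(c("summer", "autumn", "winter", "spring")),
                   moon   = c(0, 0.5, 1))

# obs: one level per individual*season*moon combination (groups the item rows)
dat$obs <- interaction(dat$indiv, dat$season, dat$moon, drop = TRUE)
# id: one level per row, i.e. an observation-level random effect for overdispersion
dat$id <- factor(seq_len(nrow(dat)))

# Made-up observation time per combination, shared by all item rows of that obs
duration <- runif(nlevels(dat$obs), min = 30, max = 120)
dat$logduration <- log(duration[as.integer(dat$obs)])

# Simulated overdispersed counts (lognormal noise on top of a Poisson rate)
rate <- exp(-4 + dat$logduration + rnorm(nrow(dat), sd = 0.5))
dat$count <- rpois(nrow(dat), rate)

if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(count ~ item + item:season + item:moon + offset(logduration) +
                       (1 | indiv) + (1 | obs) + (1 | id),
                     family = poisson, data = dat)
  print(lme4::fixef(fit))
}
```

  Because the simulated counts contain no real fox-level or obs-level
variation, lme4 may report a singular fit here; that is a property of the
fake data, not of the model structure.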

> However, there are a lot of zeros in my data, i.e. many prey items have
> never been observed being eaten for many combinations of
> individual*season*moon.

  That doesn't *necessarily* mean you need zero-inflation. Large 
numbers of zeros might just reflect low probabilities, not ZI per se.
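
  A quick way to see this is to compare the number of zeros you observe
with the number a plain Poisson model would predict. The sketch below uses
simulated data with low means and no zero-inflation at all; the variable
names and parameter values are made up for illustration.

```r
set.seed(101)
n  <- 5000
x  <- runif(n)
mu <- exp(-2 + 1 * x)        # low expected counts, no zero-inflation
y  <- rpois(n, mu)

fit <- glm(y ~ x, family = poisson)

obs_zeros <- sum(y == 0)                    # zeros actually observed
exp_zeros <- sum(dpois(0, fitted(fit)))     # zeros expected under the fitted Poisson

print(c(observed = obs_zeros, expected = exp_zeros, prop_zero = mean(y == 0)))
```

  Most of the simulated responses are zero, yet observed and expected zero
counts agree closely. Only a clear excess of observed over expected zeros
(after accounting for overdispersion) would point towards zero-inflation.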

> Following Ben Bolker's wiki (http://glmm.wikidot.com/faq), I gather that I
> should use one of the following methods to answer my question:
> 
>    - glmmADMB, with family=nbinom
>    - MCMCglmm, with family=zipoisson
>    - "expectation-maximization (EM) algorithm" in lme4

  Note there's a marginally newer version at 
https://rawgit.com/bbolker/mixedmodels-misc/master/glmmFAQ.html

  Another, newer choice is glmmTMB (available on GitHub) with
family="nbinom2".

> Here come the questions:

> 1. Is it correct to assume that I could use the same model structure
> (count~item+item:season+item:moon+offset(logduration)+(1|indiv)+(1|obs))
> in glmmADMB or MCMCglmm to answer my question?

  glmmADMB or glmmTMB, yes: I'm not sure about MCMCglmm

> 2.   I then wouldn't need the (1|id) to correct for overdispersion as both
> methods would already account for it, correct?

   That's right, I think.
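
  As a rough sketch of what that might look like in glmmTMB (simulated,
hypothetical data again; a smaller design than yours; the fit only runs if
glmmTMB is installed). Note there is no (1|id) term, since the negative
binomial dispersion parameter takes over that job.

```r
set.seed(11)

# Hypothetical long-format data: one row per prey item within each obs
dat <- expand.grid(item   = factor(paste0("prey", 1:4)),
                   indiv  = factor(1:10),
                   season = factor(c("summer", "autumn", "winter", "spring")),
                   moon   = c(0, 0.5, 1))
dat$obs <- interaction(dat$indiv, dat$season, dat$moon, drop = TRUE)

# Made-up observation time per obs and a made-up among-fox effect
duration <- runif(nlevels(dat$obs), min = 30, max = 120)
dat$logduration <- log(duration[as.integer(dat$obs)])
indiv_eff <- rnorm(nlevels(dat$indiv), sd = 0.3)

# Negative binomial counts: overdispersion comes from 'size', not from (1|id)
mu <- exp(-4 + dat$logduration + indiv_eff[as.integer(dat$indiv)])
dat$count <- rnbinom(nrow(dat), mu = mu, size = 1.5)

if (requireNamespace("glmmTMB", quietly = TRUE)) {
  fit_nb <- glmmTMB::glmmTMB(count ~ item + item:season + item:moon +
                               offset(logduration) + (1 | indiv) + (1 | obs),
                             family = glmmTMB::nbinom2, data = dat)
  print(fit_nb)
}
```

  If zero-inflation does turn out to be needed, glmmTMB's ziformula argument
(for example ziformula = ~1 for a single zero-inflation probability) is one
place to add it; whether that is appropriate for your data is something you
would have to check.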

> 3.   I am totally new to MCMCglmm, so  ...

  I'm going to let Jarrod Hadfield, or someone else, answer this one.
> 
> 4. If I were to use the EM algorithm method, how should the results
> be interpreted?

  The result is composed of two models -- a 'binary' (structural zero vs
non-structural zero) and a 'conditional' (count) part.
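
  To make the two parts concrete, here is a stripped-down sketch of the EM
idea for a zero-inflated Poisson. It is a simplification, not the code from
the wiki: it uses plain glm() with no random effects and a single constant
zero-inflation probability, and the data and parameter values are simulated
and hypothetical. In the mixed-model version the glm() call would be
replaced by a glmer() fit.

```r
set.seed(7)
n <- 4000
x <- runif(n)
structural <- rbinom(n, 1, 0.3)                       # 1 = structural zero
y <- ifelse(structural == 1, 0L, rpois(n, exp(0.5 + 1 * x)))

p_zero <- 0.5                  # starting value: probability of a structural zero
lambda <- rep(mean(y), n)      # starting value: conditional Poisson mean

for (iter in 1:100) {
  # E-step: probability that each observed zero is structural
  # (non-zero counts can never be structural zeros)
  z <- ifelse(y == 0, p_zero / (p_zero + (1 - p_zero) * exp(-lambda)), 0)

  # M-step, 'binary' part: update the structural-zero probability
  p_zero <- mean(z)

  # M-step, 'conditional' part: Poisson fit, down-weighting likely structural zeros
  cond <- glm(y ~ x, family = poisson, weights = 1 - z)
  lambda <- fitted(cond)
}

print(p_zero)       # estimate of the structural-zero probability
print(coef(cond))   # count-model coefficients, conditional on not being a structural zero
```

  So the 'binary' part tells you how often a zero arises for structural
reasons, and the coefficients of the 'conditional' part describe the
expected count among observations that are not structural zeros.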