[R-sig-ME] Mixed multinomial regression for count data with overdispersion & zero-inflation

Tue May 17 20:21:26 CEST 2016

On 17/05/2016 18:53, Stéphanie Périquet wrote:
> Dear Alain,
>
> Thanks for your reply and advice! I will try that and wait for your
> very timely paper to come out, to be sure I did the right thing!
>

Stephanie,

Although it does not cover multinomial models directly, this one may be 
of use as well:

Beginner's Guide to Zero-Inflated Models with R (2016). Zuur AF and Ieno EN
http://highstat.com/BGZIM.htm

Sorry for the self-references.

Kind regards,

Alain

> Best,
> Stephanie
>
> On 17 May 2016 at 12:08, Highland Statistics Ltd
> <highstat at highstat.com> wrote:
>
>     > Date: Tue, 17 May 2016 08:28:42 +0200
>     > From: Stéphanie Périquet <stephanie.periquet at gmail.com>
>     > To: Ben Bolker <bbolker at gmail.com>
>     > Cc: r-sig-mixed-models at r-project.org
>     > Subject: Re: [R-sig-ME] Mixed multinomial regression for count data
>     >       with overdispersion & zero-inflation
>     >
>     > Hi Ben,
>     >
>     > Thank you very much for your answer!
>     >
>     > I am aware that a lot of zeros doesn't mean zero inflation, but if my
>     > understanding is correct, the only way to check for ZI would be to
>     > compare one model that doesn't take it into account with another one
>     > that does, right?
>
>     Incorrect.
>     1. Calculate the percentage of zeros in your observed data.
>     2. Fit a model. This can be a model without any zero-inflation component.
>     3. Simulate 1000 data sets from your model and, for each simulated data
>     set, calculate the percentage of zeros.
>     4. Compare the results of step 3 with the value from step 1.
>
>     5. Even nicer:
>     5a. Plot a simple frequency table for the original data
>     (plot(table(Response), type = "h")).
>     5b. Calculate a table() for each of your simulated data sets.
>     5c. Calculate the average frequency table.
>     5d. Compare 5a and 5c.
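
A minimal sketch of steps 1 to 5 in base R, assuming a plain Poisson GLM stands in for the fitted model. The data frame `dat`, the covariate `x` and the response `y` are made-up placeholders, not the fox data. For a mixed model the same idea should carry over through the simulate() method of the fitting package (lme4 provides one for glmer fits).

```r
# Placeholder data (assumption): a Poisson response y and one covariate x.
set.seed(1)
n   <- 200
dat <- data.frame(x = runif(n))
dat$y <- rpois(n, lambda = exp(0.2 + 0.8 * dat$x))

# Step 1: percentage of zeros in the observed data.
obs_zero <- 100 * mean(dat$y == 0)

# Step 2: fit a model without any zero-inflation component.
fit <- glm(y ~ x, family = poisson, data = dat)

# Step 3: simulate 1000 data sets and get the percentage of zeros in each.
sims     <- simulate(fit, nsim = 1000)   # one column per simulated data set
sim_zero <- 100 * colMeans(sims == 0)

# Step 4: compare the observed percentage with the simulated ones.
range_sim <- quantile(sim_zero, c(0.025, 0.975))
cat("observed % zeros:", round(obs_zero, 1), "\n")
cat("simulated 95% range:", round(range_sim, 1), "\n")

# Steps 5a-5d: observed frequency table versus the average simulated one.
lev      <- 0:max(dat$y, unlist(sims))   # common set of count values
obs_tab  <- table(factor(dat$y, levels = lev))
sim_tabs <- sapply(sims, function(s) table(factor(s, levels = lev)))
avg_tab  <- rowMeans(sim_tabs)           # average frequency per count value
print(round(cbind(observed = as.vector(obs_tab), simulated = avg_tab), 1))
```

If obs_zero falls well inside the spread of sim_zero, the model already produces about as many zeros as the data contain. plot(table(dat$y), type = "h") gives the picture for 5a, and the same call on round(avg_tab) gives a rough counterpart for 5c.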
>
>     For a nice example and R code, see:
>     A protocol for conducting and presenting results of regression-type
>     analyses. Zuur & Ieno
>     doi: 10.1111/2041-210X.12577
>     Methods in Ecology and Evolution 2016
>
>     Comes out in 2 weeks or so.
>
>     Kind regards,
>
>     Alain
>
>
>     > With the model example I gave
>     > (count~item+item:season+item:moon+offset(logduration)+(1+indiv)+(1|obs)),
>     > glmmADMB doesn't run, but I'm going to dig a bit more into this and come
>     > back to you if I can't figure it out.
>     >
>     > Best,
>     > Stephanie
>     >
>     > On 17 May 2016 at 00:41, Ben Bolker <bbolker at gmail.com> wrote:
>     >
>     >> Stéphanie Périquet <stephanie.periquet at ...> writes:
>     >>
>     >>> Dear list members,
>     >>>
>     >>> First, sorry for this very long first post.
>     >>    That's OK.  I'm only going to answer part of it, because it's long.
>     >>> I am looking for advice on fitting a mixed multinomial regression to
>     >>> count data that are overdispersed and zero-inflated. My question is to
>     >>> evaluate the effect of season and moonlight on the diet composition of
>     >>> bat-eared foxes.
>     >>> My dataset is composed of 14 possible prey items, 20 individual foxes
>     >>> observed, 4 seasons and a moon illumination index ranging from 0 to 1
>     >>> in increments of 0.1 (considered as a continuous variable even though
>     >>> it takes only 11 values). For each unique combination of
>     >>> individual*season*moon, I thus have 14 lines, one for the count of each
>     >>> prey item.
>     >>>
>     >>> From what I gathered, it would be possible to use a standard GLMM of
>     >>> the following form to answer my question (i.e. a multinomial
>     >>> regression):
>     >>>
>     >>> glmer(count~item+item:season+item:moon+offset(logduration)+
>     >>> (1+indiv)+(1|obs)+
>     >>> (1|id), family=poisson)
>     >>    Yes, but I don't know if this will account for the possible
>     >> dependence *among* prey types.
>     >>
>     >>> where count is the number of prey of a given type recorded eaten;
>     >>>
>     >>> item is the prey type;
>     >>>
>     >>> logduration is the log(total time observed for a given combination of
>     >>> individual*season*moon);
>     >>>
>     >>> obs is a unique id for each combination of individual*season*moon, so
>     >>> each obs value groups 14 lines (one for each prey item) with the same
>     >>> individual*season*moon;
>     >>>
>     >>> id is a unique id for each line to account for overdispersion (as
>     >>> quasi-Poisson or negative binomial distributions are not implemented in
>     >>> lme4, Elston et al. 2001).
>     >>     Seems about right.
>     >>     There is glmer.nb now, but you might not want it; it tends to
>     >> be slower and more fragile, and you'd still have to deal with
>     >> zero-inflation.
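
A minimal sketch of how the grouping variables described above could be built, assuming long-format data with one row per prey item within each individual*season*moon combination. All values are made up (fewer items, foxes and moon values than in the real study), the observation time `dur` is hypothetical, and `(1+indiv)` in the posted formula is read here as `(1|indiv)`, which is an assumption. The glmer() call only runs if lme4 is installed.

```r
# Placeholder design (assumption): 4 prey items, 3 foxes, 2 seasons, 3 moon values.
set.seed(2)
dat <- expand.grid(item   = factor(paste0("prey", 1:4)),
                   indiv  = factor(1:3),
                   season = factor(c("dry", "wet")),
                   moon   = c(0, 0.5, 1))

# obs: one level per individual*season*moon combination.
dat$obs <- interaction(dat$indiv, dat$season, dat$moon, drop = TRUE)

# logduration: log of the (hypothetical) time observed, constant within obs.
dur <- runif(nlevels(dat$obs), min = 1, max = 5)
dat$logduration <- log(dur[as.integer(dat$obs)])

# Made-up counts whose mean scales with the observation time.
dat$count <- rpois(nrow(dat), lambda = 0.5 * exp(dat$logduration))

# id: one level per row, the observation-level random effect for overdispersion.
dat$id <- factor(seq_len(nrow(dat)))

# Model formula from the post, with (1 | indiv) assumed for the individual effect.
form <- count ~ item + item:season + item:moon + offset(logduration) +
  (1 | indiv) + (1 | obs) + (1 | id)

if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(form, data = dat, family = poisson)
  print(lme4::VarCorr(fit))
}
```

With data this small the fit may well be singular or throw convergence warnings; the point is only the structure of obs and id.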
>     >>
>     >>> However, there are a lot of zeros in my data, i.e. many prey items
>     >>> have never been observed being eaten for many combinations of
>     >>> individual*season*moon.
>     >>    That doesn't *necessarily* mean you need zero-inflation. Large
>     >> numbers of zeros might just reflect low probabilities, not ZI per se.
>     >>
>     >>> Following Ben Bolker's wiki (http://glmm.wikidot.com/faq), I gather
>     >>> that I should use one of the following methods to answer my question:
>     >>>
>     >>>     - glmmADMB, with family=nbinom
>     >>>     - MCMCglmm, with family=zipoisson
>     >>>     - "expectation-maximization (EM) algorithm" in lme4
>     >>    Note there's a marginally newer version at
>     >> https://rawgit.com/bbolker/mixedmodels-misc/master/glmmFAQ.html
>     >>
>     >>    Another, newer choice is glmmTMB (available on Github) with
>     >> family="nbinom2"
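
A rough sketch of what the glmmTMB option could look like, assuming a simple random-intercept model on made-up data rather than the diet model from the post. The names `dat`, `g`, `x` and `count` are placeholders. The second fit adds an intercept-only zero-inflation part through `ziformula`; the block only fits anything if glmmTMB is installed.

```r
# Placeholder data (assumption): negative binomial counts in 10 groups,
# with roughly 30% extra zeros added on top.
set.seed(3)
n   <- 300
dat <- data.frame(g = factor(rep(1:10, each = 30)), x = runif(n))
re  <- rnorm(10, sd = 0.3)                          # group-level intercepts
mu  <- exp(0.5 + 1.0 * dat$x + re[as.integer(dat$g)])
cnt <- rnbinom(n, mu = mu, size = 2)
dat$count <- ifelse(rbinom(n, 1, 0.3) == 1, 0L, cnt)

if (requireNamespace("glmmTMB", quietly = TRUE)) {
  # Negative binomial (quadratic variance) without zero-inflation.
  fit_nb <- glmmTMB::glmmTMB(count ~ x + (1 | g), data = dat,
                             family = glmmTMB::nbinom2)
  # Same model with a constant zero-inflation probability.
  fit_zinb <- glmmTMB::glmmTMB(count ~ x + (1 | g), ziformula = ~ 1,
                               data = dat, family = glmmTMB::nbinom2)
  print(glmmTMB::fixef(fit_zinb))
}
```

Whether the zero-inflation part is needed at all is a separate question from whether the model can be fitted.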
>     >>
>     >>> Here come the questions:
>     >>> 1. Is it correct to assume that I could use the same model structure
>     >>> (count~item+item:season+item:moon+offset(logduration)+(1+indiv)+(1|obs))
>     >>> in glmmADMB or MCMCglmm to answer my question?
>     >>    glmmADMB or glmmTMB, yes: I'm not sure about MCMCglmm
>     >>
>     >>> 2. I then wouldn't need the (1|id) to correct for overdispersion, as
>     >>> both methods would already account for it, correct?
>     >>     That's right, I think.
>     >>
>     >>> 3.   I am totally new to MCMCglmm, so  ...
>     >>    I'm going to let Jarrod Hadfield, or someone else, answer this one.
>     >>> 4. If I were to use the EM algorithm method, how should the results
>     >>> be interpreted?
>     >>    The result is composed of two models -- a 'binary' (structural zero
>     >> vs non-structural zero) and a 'conditional' (count) part.
>     >> _______________________________________________
>     >> R-sig-mixed-models at r-project.org mailing list
>     >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     >
>     >
>     >
>
>     --
>     Dr. Alain F. Zuur
>
>     First author of:
>     1. Beginner's Guide to GAMM with R (2014).
>     2. Beginner's Guide to GLM and GLMM with R (2013).
>     3. Beginner's Guide to GAM with R (2012).
>     4. Zero Inflated Models and GLMM with R (2012).
>     5. A Beginner's Guide to R (2009).
>     6. Mixed effects models and extensions in ecology with R (2009).
>     7. Analysing Ecological Data (2007).
>
>     Highland Statistics Ltd.
>     9 St Clair Wynd
>     UK - AB41 6DZ Newburgh
>     Tel:   0044 1358 788177
>     Email: highstat at highstat.com
>     URL: www.highstat.com
>
> -- 
> Stéphanie PERIQUET (PhD) - Bat-eared Fox Research Project
> Dept of Zoology & Entomology
> University of the Free State, Qwaqwa Campus
> Cell: +27 79 570 2683
> ResearchGate profile <https://www.researchgate.net/profile/Stephanie_Periquet>
>
> Kalahari bat-eared foxes on Twitter <https://twitter.com/kal_batearedfox>
