[R-sig-ME] mixed multinomial regression for count data with overdispersion & zero-inflation

Stéphanie Périquet stephanie.periquet at gmail.com
Wed May 18 10:48:22 CEST 2016


OK, thanks for the details. The histogram confirms that I do have zero
inflation: the percentage of zeros in the original data lies outside the
distribution of the percentages from the simulated data...
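
A minimal sketch of this check, for anyone following the thread. It is
not my actual analysis: the toy data frame, the variable names and the plain
Poisson glm() are assumptions used only so that the example runs with base R.
simulate() on a glmer fit also returns one column per simulated data set, so
the same steps should carry over.

```r
# Hypothetical toy data standing in for diet3; a plain Poisson GLM stands in
# for the glmer() fit so that the sketch needs base R only.
set.seed(1)
toy <- data.frame(x = runif(200))
toy$count <- rpois(200, lambda = exp(-0.5 + toy$x))
fit <- glm(count ~ x, family = poisson, data = toy)

# Proportion of zeros in the observed data
obs_zero <- mean(toy$count == 0)

# One column per simulated data set; proportion of zeros in each column
sims <- simulate(fit, nsim = 1000)
sim_zero <- colMeans(sims == 0)

# Histogram of the simulated proportions with the observed value as a red dot
hist(sim_zero, xlim = range(c(sim_zero, obs_zero)),
     xlab = "Proportion of zeros", main = "Simulated vs observed zeros")
points(obs_zero, 0, col = "red", pch = 19, cex = 2)

# Share of simulated data sets with at least as many zeros as observed
p_upper <- mean(sim_zero >= obs_zero)
```

If the red dot sits far in a tail (p_upper close to 0 or 1), the fitted model
does not reproduce the number of zeros in the data.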

Best,
Stephanie

On 18 May 2016 at 09:39, Highland Statistics Ltd <highstat at highstat.com>
wrote:

>
>
> On 18/05/2016 08:26, Stéphanie Périquet wrote:
>
> Yeah thanks Alain, I'm definitely planning to buy this book!
>
> So I looked at the zeros in my data based on your advice and I did the
> following:
>
> mod<-glmer(count~item+item:season+item:moon+item:season:moon+(1|indiv/obs)+(1|id),family=poisson,nAGQ=0,data=diet3)
> z<-simulate(mod,nsim=1000)
>
> For the original data I have 69.3% of zeros while the average over the
> 1000 simulations is 63.5%. Is there a way to statistically compare these 2
> values? Or could you say that these 2 figures are not very different and
> then zero inflation models might not be necessary?
>
>
> Stephanie,
>
> Make a histogram of the 1000 values of the percentages of zeros....and
> present the 69.3% as a big blue/red dot. If the dot for your observed data
> is in the tails you have a problem.
>
> I don't see the point of a test in your case. Such a simulation is close
> to bootstrapping...so I guess you can come up with a test somehow. If you
> do this type of analysis in a Bayesian framework it is often (and
> confusingly) called a Bayesian p-value (counting how often the simulated
> value is larger than your observed one).
>
> I would just go for the histogram...seems you are lucky.
>
>
> Alain
>
>
>
>
>
>
> Best,
> Stephanie
>
> On 17 May 2016 at 20:21, Highland Statistics Ltd <highstat at highstat.com>
> wrote:
>
>>
>>
>> On 17/05/2016 18:53, Stéphanie Périquet wrote:
>>
>> Dear Alain,
>>
>> Thanks for your reply and advice! Will try to do that and wait for your
>> very timely paper to come out to be sure I did the right thing!
>>
>>
>> Stephanie,
>>
>> Although it does not cover multinomial models directly, this one may be
>> of use as well:
>>
>> Beginner's Guide to Zero-Inflated Models with R (2016). Zuur AF and Ieno EN
>> http://highstat.com/BGZIM.htm
>>
>> Sorry for the self-references.
>>
>> Kind regards,
>>
>> Alain
>>
>>
>> Best,
>> Stephanie
>>
>> On 17 May 2016 at 12:08, Highland Statistics Ltd <highstat at highstat.com>
>> wrote:
>>
>>>
>>>
>>>
>>> > ----------------------------------------------------------------------
>>> >
>>> > Message: 1
>>> > Date: Tue, 17 May 2016 08:28:42 +0200
>>> > From: Stéphanie Périquet <stephanie.periquet at gmail.com>
>>> > To: Ben Bolker <bbolker at gmail.com>
>>> > Cc: r-sig-mixed-models at r-project.org
>>> > Subject: Re: [R-sig-ME] Mixed multinomial regression for count data
>>> >       with overdispersion & zero-inflation
>>> > Message-ID:
>>> >       <CAMKTVFXZnvS1g-FaNVQ1FQUj5u84S-fd=k4u_6x5PwJUZ2R+bQ at mail.gmail.com>
>>> > Content-Type: text/plain; charset="UTF-8"
>>> >
>>> > Hi Ben,
>>> >
>>> > Thank you very much for your answer!
>>> >
>>> > I am aware that a lot of zeros doesn't mean zero inflation, but if my
>>> > understanding is correct, the only way to check for ZI would be to
>>> > compare one model that doesn't take it into account with another one
>>> > that does, right?
>>>
>>> Incorrect.
>>> 1. Calculate the percentage of zeros for your observed data.
>>> 2. Fit a model....this can be a model without zero inflation stuff.
>>> 3. Simulate 1000 data sets from your model and for each simulated data
>>> set assess the percentage of zeros.
>>> 4. Compare the results in 3 with those in 1.
>>>
>>> 5. Even nicer....
>>> 5a. Plot a simple frequency table for the original data
>>> (plot(table(Response), type = "h")).
>>> 5b. Calculate a table() for each of your simulated data.
>>> 5c. Calculate the average frequency table.
>>> 5d. Compare 5a and 5c.
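
Steps 5a to 5d above can be sketched roughly as follows. This is only an
illustration under stated assumptions: the toy data, the variable names and
the plain Poisson glm() are hypothetical stand-ins for the real data and the
mixed model, chosen so that the code runs with base R.

```r
# Hypothetical toy data and a plain Poisson GLM as a stand-in for the real fit
set.seed(2)
toy <- data.frame(x = runif(200))
toy$count <- rpois(200, lambda = exp(-0.5 + toy$x))
fit <- glm(count ~ x, family = poisson, data = toy)
sims <- simulate(fit, nsim = 1000)

# Use a common set of count values so that every table has the same cells
lev <- 0:max(toy$count, unlist(sims))

# 5a: frequency table of the observed response
obs_tab <- table(factor(toy$count, levels = lev))

# 5b: one frequency table per simulated data set (rows = counts, cols = sims)
sim_tabs <- sapply(sims, function(y) table(factor(y, levels = lev)))

# 5c: average frequency table over the simulations
avg_tab <- rowMeans(sim_tabs)

# 5d: observed frequencies (black) next to average simulated ones (red)
plot(lev, as.vector(obs_tab), type = "h", xlab = "Count", ylab = "Frequency")
points(lev + 0.2, avg_tab, type = "h", col = "red")
```

Fixing the factor levels before calling table() is a design choice: without
it, data sets with different maximum counts give tables of different lengths
that cannot be averaged cell by cell.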
>>>
>>> For a nice example and R code, see:
>>> A protocol for conducting and presenting results of regression-type
>>> analyses. Zuur & Ieno
>>> doi: 10.1111/2041-210X.12577
>>> Methods in Ecology and Evolution 2016
>>>
>>> Comes out in 2 weeks or so.
>>>
>>> Kind regards,
>>>
>>> Alain
>>>
>>>
>>> > With the model example I gave
>>> > (count~item+item:season+item:moon+offset(logduration)+(1+indiv)+(1|obs))
>>> > glmmADMB doesn't run, but I'm going to dig a bit more into this and come
>>> > back to you if I can't figure it out.
>>> >
>>> > Best,
>>> > Stephanie
>>> >
>>> > On 17 May 2016 at 00:41, Ben Bolker <bbolker at gmail.com> wrote:
>>> >
>>> >> Stéphanie Périquet <stephanie.periquet at ...> writes:
>>> >>
>>> >>> Dear list members,
>>> >>>
>>> >>> First, sorry for this very long first post.
>>> >>    That's OK.  I'm only going to answer part of it, because it's long.
>>> >>> I am looking for advice on fitting a mixed multinomial regression to
>>> >>> count data that are overdispersed and zero-inflated. My question is to
>>> >>> evaluate the effect of season and moonlight on the diet composition of
>>> >>> bat-eared foxes.
>>> >>> My dataset is composed of 14 possible prey items, 20 individual foxes
>>> >>> observed, 4 seasons and a moon illumination index ranging from 0 to 1
>>> >>> in increments of 0.1 (treated as a continuous variable even though it
>>> >>> takes only 11 values). For each unique combination of
>>> >>> individual*season*moon, I thus have 14 lines, one for the count of each
>>> >>> prey item.
>>> >>>
>>> >>> From what I gathered, it would be possible to use a standard GLMM of
>>> >>> the following form to answer my question (i.e. a multinomial regression):
>>> >>>
>>> >>> glmer(count~item+item:season+item:moon+offset(logduration)+
>>> >>> (1+indiv)+(1|obs)+
>>> >>> (1|id), family=poisson)
>>> >>    Yes, but I don't know if this will account for the possible
>>> >> dependence *among* prey types.
>>> >>
>>> >>> where count is the number of prey of a given type recorded eaten;
>>> >>>
>>> >>> item is the prey type;
>>> >>>
>>> >>> logduration is the log(total time observed for a given combination of
>>> >>> individual*season*moon);
>>> >>>
>>> >>> obs is a unique id for each combination of individual*season*moon, so
>>> >>> each obs value groups 14 lines (one for each prey item) with the same
>>> >>> individual*season*moon;
>>> >>>
>>> >>> id is a unique id for each line to account for overdispersion (as
>>> >>> quasi-Poisson or negative binomial distributions are not implemented in
>>> >>> lme4, Elston et al. 2001).
>>> >>     Seems about right.
>>> >>     There is glmer.nb now, but you might not want it; it tends to
>>> >> be slower and more fragile, and you'd still have to deal with
>>> >> zero-inflation.
>>> >>
>>> >>> However, there are a lot of zeros in my data, i.e. many prey items
>>> >>> have never been observed being eaten for many combinations of
>>> >>> individual*season*moon.
>>> >>    That doesn't *necessarily* mean you need zero-inflation. Large
>>> >> numbers of zeros might just reflect low probabilities, not ZI per se.
>>> >>
>>> >>> Following Ben Bolker's wiki (http://glmm.wikidot.com/faq) I gather that
>>> >>> I should use one of the following methods to answer my question:
>>> >>>
>>> >>>     - glmmADMB, with family=nbinom
>>> >>>     - MCMCglmm, with family=zipoisson
>>> >>>     - "expectation-maximization (EM) algorithm" in lme4
>>> >>    Note there's a marginally newer version at
>>> >> https://rawgit.com/bbolker/mixedmodels-misc/master/glmmFAQ.html
>>> >>
>>> >>    Another, newer choice is glmmTMB (available on Github) with
>>> >> family="nbinom2"
>>> >>
>>> >>> Here come the questions:
>>> >>> 1. Is it correct to assume that I could use the same model structure
>>> >>> (count~item+item:season+item:moon+offset(logduration)+(1+indiv)+(1|obs))
>>> >>> in glmmADMB or MCMCglmm to answer my question?
>>> >>    glmmADMB or glmmTMB, yes: I'm not sure about MCMCglmm
>>> >>
>>> >>> 2. I then wouldn't need the (1|id) to correct for overdispersion as both
>>> >>> methods would already account for it, correct?
>>> >>     That's right, I think.
>>> >>
>>> >>> 3.   I am totally new to MCMCglmm, so  ...
>>> >>    I'm going to let Jarrod Hadfield, or someone else, answer this one.
>>> >>> 4. If I were to use the EM algorithm method, how should the results
>>> >>> be interpreted?
>>> >>    The result is composed of two models -- a 'binary' (structural zero
>>> >> vs non-structural zero) and a 'conditional' (count) part.
>>> >> _______________________________________________
>>> >> R-sig-mixed-models at r-project.org mailing list
>>> >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>> >
>>> >
>>> >
>>>
>>> --
>>> Dr. Alain F. Zuur
>>>
>>> First author of:
>>> 1. Beginner's Guide to GAMM with R (2014).
>>> 2. Beginner's Guide to GLM and GLMM with R (2013).
>>> 3. Beginner's Guide to GAM with R (2012).
>>> 4. Zero Inflated Models and GLMM with R (2012).
>>> 5. A Beginner's Guide to R (2009).
>>> 6. Mixed effects models and extensions in ecology with R (2009).
>>> 7. Analysing Ecological Data (2007).
>>>
>>> Highland Statistics Ltd.
>>> 9 St Clair Wynd
>>> UK - AB41 6DZ Newburgh
>>> Tel:   0044 1358 788177
>>> Email: highstat at highstat.com
>>> URL:   www.highstat.com
>>>
>>>
>>>
>>
>>
>>
>> --
>> *Stéphanie PERIQUET (PhD) * - Bat-eared Fox Research Project
>> *Dept of Zoology & Entomology*
>> *University of the Free State, Qwaqwa Campus*
>> *Cell: +27 79 570 2683*
>> ResearchGate profile
>> <https://www.researchgate.net/profile/Stephanie_Periquet>
>>
>>
>> Kalahari bat-eared foxes on Twitter <https://twitter.com/kal_batearedfox>
>>
>>
>>
>
>
>
>


-- 
*Stéphanie PERIQUET (PhD) * - Bat-eared Fox Research Project
*Dept of Zoology & Entomology*
*University of the Free State, Qwaqwa Campus*
*Cell: +27 79 570 2683*
ResearchGate profile
<https://www.researchgate.net/profile/Stephanie_Periquet>


Kalahari bat-eared foxes on Twitter <https://twitter.com/kal_batearedfox>
