[R-sig-ME] mixed multinomial regression for count data with overdispersion & zero-inflation
Highland Statistics Ltd
highstat at highstat.com
Wed May 18 09:39:53 CEST 2016
On 18/05/2016 08:26, Stéphanie Périquet wrote:
> Yeah thanks Alain, I'm definitely planning to buy this book!
>
> So I looked at the zeros in my data based on your advice and I did the
> following:
> mod<-glmer(count~item+item:season+item:moon+item:season:moon+(1|indiv/obs)+(1|id),family=poisson,nAGQ=0,data=diet3)
> z<-simulate(mod,nsim=1000)
>
> For the original data I have 69.3% zeros, while the average over the
> 1000 simulations is 63.5%. Is there a way to statistically compare
> these 2 values? Or could you say that these 2 figures are not very
> different and that zero-inflation models might therefore not be necessary?
>
Stephanie,
Make a histogram of the 1000 values of the percentages of zeros....and
present the 69.3% as a big blue/red dot. If the dot for your observed
data is in the tails you have a problem.
I don't see the point of a test in your case. Such a simulation is close
to bootstrapping...so I guess you can come up with a test somehow. If
you do this type of analysis in a Bayesian framework it is often (and
confusingly) called a Bayesian p-value (counting how often the simulated
value is larger than your observed one).
I would just go for the histogram...seems you are lucky.
Alain
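
A rough sketch of the histogram check described above, not taken from the thread. It assumes toy Poisson data and a plain glm() so that it runs in base R; the names toy, fit, sim_zero and p_sim are made up for illustration. With an lme4 fit, simulate() on the glmer object should return a comparable data frame of simulated responses.

```r
# Toy data standing in for the real diet counts (hypothetical values).
set.seed(1)
n   <- 500
x   <- runif(n)
toy <- data.frame(x = x, y = rpois(n, lambda = exp(-0.5 + 0.8 * x)))

# Poisson GLM without any zero-inflation component.
fit <- glm(y ~ x, family = poisson, data = toy)

# Percentage of zeros in the observed data.
obs_zero <- 100 * mean(toy$y == 0)

# Simulate 1000 data sets from the fitted model (one column per data set)
# and compute the percentage of zeros in each.
sims     <- simulate(fit, nsim = 1000)
sim_zero <- 100 * colMeans(sims == 0)

# Histogram of the simulated percentages, observed value as a red dot.
hist(sim_zero, main = "", xlab = "% zeros in simulated data",
     xlim = range(c(sim_zero, obs_zero)))
points(obs_zero, 0, pch = 16, col = "red", cex = 2)

# Share of simulated data sets with at least as many zeros as observed.
p_sim <- mean(sim_zero >= obs_zero)
```

If the red dot sits in the bulk of the histogram, the model reproduces the zeros reasonably well; p_sim close to 0 or 1 means the dot is in a tail.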
> Best,
> Stephanie
>
> On 17 May 2016 at 20:21, Highland Statistics Ltd
> <highstat at highstat.com <mailto:highstat at highstat.com>> wrote:
>
>
>
> On 17/05/2016 18:53, Stéphanie Périquet wrote:
>> Dear Alain,
>>
>> Thanks for your reply and advice! Will try to do that and wait
>> for your very timely paper to come out to be sure I did the right
>> thing!
>>
>
> Stephanie,
>
> Although it does not cover multinomial models directly, this one
> may be of use as well:
>
> Beginner's Guide to Zero-Inflated Models with R (2016). Zuur AF
> and Ieno EN
> http://highstat.com/BGZIM.htm
>
> Sorry for the self-references.
>
> Kind regards,
>
> Alain
>
>
>> Best,
>> Stephanie
>>
>> On 17 May 2016 at 12:08, Highland Statistics Ltd
>> <highstat at highstat.com <mailto:highstat at highstat.com>> wrote:
>>
>>
>>
>>
>> >
>> ----------------------------------------------------------------------
>> >
>> > Message: 1
>> > Date: Tue, 17 May 2016 08:28:42 +0200
>> > From: Stéphanie Périquet <stephanie.periquet at gmail.com
>> <mailto:stephanie.periquet at gmail.com>>
>> > To: Ben Bolker <bbolker at gmail.com <mailto:bbolker at gmail.com>>
>> > Cc: r-sig-mixed-models at r-project.org
>> <mailto:r-sig-mixed-models at r-project.org>
>> > Subject: Re: [R-sig-ME] Mixed multinomial regression for count data
>> > with overdispersion & zero-inflation
>> > Message-ID:
>> >
>> <CAMKTVFXZnvS1g-FaNVQ1FQUj5u84S-fd=k4u_6x5PwJUZ2R+bQ at mail.gmail.com
>> <mailto:k4u_6x5PwJUZ2R+bQ at mail.gmail.com>>
>> > Content-Type: text/plain; charset="UTF-8"
>> >
>> > Hi Ben,
>> >
>> > Thank you very much for your answer!
>> >
>> > I am aware that a lot of zeros doesn't mean zero inflation, but if my
>> > understanding is correct, the only way to check for ZI would be to
>> > compare one model that doesn't take it into account with another one
>> > that does, right?
>>
>> Incorrect.
>> 1. Calculate the percentage of zeros for your observed data.
>> 2. Fit a model....this can be a model without zero inflation
>> stuff.
>> 3. Simulate 1000 data sets from your model and for each
>> simulated data
>> set assess the percentage of zeros.
>> 4. Compare the results in 3 with those in 1.
>>
>> 5. Even nicer....
>> 5a. Plot a simple frequency table for the original data
>> (plot(table(Response), type = "h")).
>> 5b. Calculate a table() for each of your simulated data.
>> 5c. Calculate the average frequency table.
>> 5d. Compare 5a and 5c.
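
Steps 5a to 5d could look roughly like this. This is only a sketch on toy Poisson data with a plain glm(), not the code from the paper cited below; toy, fit, lev and the other object names are assumptions made for the example.

```r
# Toy data and a Poisson GLM standing in for the real model (hypothetical).
set.seed(2)
n    <- 500
x    <- runif(n)
toy  <- data.frame(x = x, y = rpois(n, lambda = exp(-0.5 + 0.8 * x)))
fit  <- glm(y ~ x, family = poisson, data = toy)
sims <- simulate(fit, nsim = 1000)

# Use one common set of count values so that all tables line up.
lev <- 0:max(toy$y, unlist(sims))

# 5a. Frequency table of the observed response.
obs_tab <- table(factor(toy$y, levels = lev))

# 5b. One frequency table per simulated data set
#     (rows = count values, columns = simulations).
sim_tabs <- sapply(sims, function(s) table(factor(s, levels = lev)))

# 5c. Average frequency table over the simulations.
avg_tab <- rowMeans(sim_tabs)

# 5d. Compare: observed frequencies in black, simulated average in red.
plot(lev, as.numeric(obs_tab), type = "h",
     xlab = "Count", ylab = "Frequency")
points(lev + 0.2, avg_tab, type = "h", col = "red")
```

A clear excess of observed zeros over the red bar at 0 would point towards zero inflation; bars of similar height would not.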
>>
>> For a nice example and R code, see:
>> A protocol for conducting and presenting results of
>> regression-type
>> analyses. Zuur & Ieno
>> doi: 10.1111/2041-210X.12577
>> Methods in Ecology and Evolution 2016
>>
>> Comes out in 2 weeks or so.
>>
>> Kind regards,
>>
>> Alain
>>
>>
>> > With the model example I gave
>> > (count~item+item:season+item:moon+offset(logduration)+(1|indiv)+(1|obs))
>> > glmmADMB doesn't run but I'm gonna dig a bit more into this and come
>> > back to you if I can't figure it out.
>> >
>> > Best,
>> > Stephanie
>> >
>> > On 17 May 2016 at 00:41, Ben Bolker <bbolker at gmail.com
>> <mailto:bbolker at gmail.com>> wrote:
>> >
>> >> Stéphanie Périquet <stephanie.periquet at ...>
>> <mailto:stephanie.periquet at ...> writes:
>> >>
>> >>> Dear list members,
>> >>>
>> >>> First, sorry for this very long first post.
>> >> That's OK. I'm only going to answer part of it,
>> because it's long.
>> >>> I am looking for advice on fitting a mixed multinomial regression
>> >>> to count data that are overdispersed and zero-inflated. My question
>> >>> is how season and moonlight affect the diet composition of
>> >>> bat-eared foxes.
>> >>> My dataset comprises 14 possible prey items, 20 individual foxes
>> >>> observed, 4 seasons and a moon illumination index ranging from 0
>> >>> to 1 in increments of 0.1 (treated as a continuous variable even
>> >>> though it takes only 11 values). For each unique combination of
>> >>> individual*season*moon, I thus have 14 lines, one for the count of
>> >>> each prey item.
>> >>>
>> >>>
>> >>> From what I gathered, it would be possible to use a standard glmm
>> >>> model of the following form to answer my question (i.e. a
>> >>> multinomial regression):
>> >>>
>> >>> glmer(count~item+item:season+item:moon+offset(logduration)+
>> >>> (1|indiv)+(1|obs)+(1|id), family=poisson)
>> >> Yes, but I don't know if this will account for the
>> possible dependence
>> >> *among* prey types.
>> >>
>> >>> where count is the number of prey of a given type recorded eaten;
>> >>>
>> >>> item is the prey type;
>> >>>
>> >>> logduration is the log(total time observed for a given combination
>> >>> of individual*season*moon);
>> >>>
>> >>> obs is a unique id for each combination of individual*season*moon,
>> >>> so each obs value groups 14 lines (one for each prey item) with the
>> >>> same individual*season*moon;
>> >>>
>> >>> id is a unique id for each line to account for overdispersion (as
>> >>> quasi-poisson or negative binomial distributions are not
>> >>> implemented in lme4, Elston et al. 2001).
>> >> Seems about right.
>> >> There is glmer.nb now, but you might not want it; it
>> tends to
>> >> be slower and more fragile, and you'd still have to deal with
>> >> zero-inflation.
>> >>
>> >>> However, there are a lot of zeros in my data, i.e. many prey items
>> >>> were never observed being eaten for many combinations of
>> >>> individual*season*moon.
>> >> That doesn't *necessarily* mean you need
>> zero-inflation. Large
>> >> numbers of zeros might just reflect low probabilities, not
>> ZI per se.
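
To put numbers on this point: under a plain Poisson model the probability of a zero is exp(-mu), so a small mean alone produces many zeros. A quick check in base R (the means in mu are arbitrary illustrative values, not estimates from the fox data):

```r
# Probability of observing a zero under a Poisson distribution with mean mu.
mu     <- c(0.1, 0.3, 0.5, 1, 2)
p_zero <- dpois(0, lambda = mu)   # identical to exp(-mu)

# Percentage of zeros expected for each mean, without any zero inflation.
data.frame(mean = mu, percent_zeros = round(100 * p_zero, 1))
```

A mean of 0.3, for example, already gives about 74% zeros with no zero-inflation component at all.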
>> >>
>> >>> Following Ben Bolker's wiki (http://glmm.wikidot.com/faq), I gather
>> >>> that I should use one of the following methods to answer my question:
>> >>>
>> >>> - glmmADMB, with family=nbinom
>> >>> - MCMCglmm, with family=zipoisson
>> >>> - "expectation-maximization (EM) algorithm" in lme4
>> >> Note there's a marginally newer version at
>> >>
>> https://rawgit.com/bbolker/mixedmodels-misc/master/glmmFAQ.html
>> >>
>> >> Another, newer choice is glmmTMB (available on Github) with
>> >> family="nbinom2"
>> >>
>> >>> Here come the questions:
>> >>> 1. Is it correct to assume that I could use the same model
>> >>> structure
>> >>> (count~item+item:season+item:moon+offset(logduration)+(1|indiv)+(1|obs))
>> >>> in glmmADMB or MCMCglmm to answer my question?
>> >> glmmADMB or glmmTMB, yes: I'm not sure about MCMCglmm
>> >>
>> >>> 2. I then wouldn't need the (1|id) to correct for
>> overdispersion as
>> >> both
>> >>> methods would already account for it, correct?
>> >> That's right, I think.
>> >>
>> >>> 3. I am totally new to MCMCglmm, so ...
>> >> I'm going to let Jarrod Hadfield, or someone else,
>> answer this one.
>> >>> 4. If I were to use the EM algorithm method, how should the
>> >>> results be interpreted?
>> >> The result is composed of two models -- a 'binary'
>> (structural zero vs
>> >> non-structural zero) and a 'conditional' (count) part.
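
One way to picture the two parts is to simulate from them. The sketch below assumes a zero-inflated Poisson with no covariates; pi_zero (probability of a structural zero) and mu (mean of the count part) are hypothetical values chosen for illustration, not anything fitted in this thread.

```r
set.seed(3)
n       <- 10000
pi_zero <- 0.3   # 'binary' part: probability of a structural zero (assumed)
mu      <- 2     # 'conditional' part: Poisson mean of the counts (assumed)

# Binary part decides whether an observation is a structural zero;
# otherwise the count part generates the value (which may still be 0).
structural <- rbinom(n, size = 1, prob = pi_zero)
y          <- ifelse(structural == 1, 0L, rpois(n, lambda = mu))

# Zeros come from both parts: pi + (1 - pi) * exp(-mu).
p0_theory <- pi_zero + (1 - pi_zero) * exp(-mu)
p0_sim    <- mean(y == 0)

# The overall mean is shrunk by the structural zeros: (1 - pi) * mu.
mean_theory <- (1 - pi_zero) * mu
mean_sim    <- mean(y)
```

So a coefficient in the binary part shifts the probability of a structural zero, while a coefficient in the count part shifts the expected count among observations that are not structural zeros.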
>> >> _______________________________________________
>> >> R-sig-mixed-models at r-project.org
>> <mailto:R-sig-mixed-models at r-project.org> mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> >
>> >
>> >
>>
>> --
>> Dr. Alain F. Zuur
>>
>> First author of:
>> 1. Beginner's Guide to GAMM with R (2014).
>> 2. Beginner's Guide to GLM and GLMM with R (2013).
>> 3. Beginner's Guide to GAM with R (2012).
>> 4. Zero Inflated Models and GLMM with R (2012).
>> 5. A Beginner's Guide to R (2009).
>> 6. Mixed effects models and extensions in ecology with R (2009).
>> 7. Analysing Ecological Data (2007).
>>
>> Highland Statistics Ltd.
>> 9 St Clair Wynd
>> UK - AB41 6DZ Newburgh
>> Tel: 0044 1358 788177
>> Email: highstat at highstat.com <mailto:highstat at highstat.com>
>> URL: www.highstat.com <http://www.highstat.com>
>>
>>
>> [[alternative HTML version deleted]]
>>
>>
>> --
>> *Stéphanie PERIQUET (PhD) * - Bat-eared Fox Research Project
>> /Dept of Zoology & Entomology/
>> /University of the Free State, Qwaqwa Campus/
>> *Cell: +27 79 570 2683*
>> ResearchGate profile
>> <https://www.researchgate.net/profile/Stephanie_Periquet>
>>
>>
>> Kalahari bat-eared foxes on Twitter
>> <https://twitter.com/kal_batearedfox>
>