[R-sig-ME] Question about zero-inflated Poisson glmer

Thu Jun 23 23:07:29 CEST 2016

glmmTMB handles crossed random effects. Ben did some timings in vignette("glmmTMB"), and it was 2.3 times faster than glmer for one simple GLMM.
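
For illustration, a minimal sketch of what a zero-inflated Poisson fit with two crossed random effects might look like in glmmTMB. The data are simulated, and the names `user`, `item`, `x` and `y` are made up for this example; they are not from Philipp's data.

```r
library(glmmTMB)

# Simulated data with two crossed grouping factors (hypothetical names)
set.seed(101)
n <- 1000
user_id <- sample(1:20, n, replace = TRUE)
item_id <- sample(1:15, n, replace = TRUE)
x <- rnorm(n)
u <- rnorm(20, sd = 0.5)   # random intercepts for users
v <- rnorm(15, sd = 0.5)   # random intercepts for items
eta <- 0.5 + 0.3 * x + u[user_id] + v[item_id]

# Poisson counts, with about 40% of rows forced to structural zeros
y <- rpois(n, exp(eta)) * rbinom(n, size = 1, prob = 0.6)
d <- data.frame(y = y, x = x,
                user = factor(user_id), item = factor(item_id))

# Crossed random intercepts in the count part, constant zero-inflation
fit <- glmmTMB(y ~ x + (1 | user) + (1 | item),
               ziformula = ~ 1,
               family = poisson,
               data = d)
summary(fit)
```

The random-effect terms use the same syntax as in glmer; the `ziformula` argument adds the zero-inflation part.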

> On 23Jun 2016, at 10:44, Philipp Singer <killver at gmail.com> wrote:
> 
> Did try glmmADMB but unfortunately it is way too slow for my data.
> 
> Did not know about glmmTMB, will try it out. Does it work with crossed random effects and how does it scale with more data? I will check the documentation and try it, though. Thanks for the info.
> 
> On 23.06.2016 19:14, Ben Bolker wrote:
>>   I would also comment that glmmTMB is likely to be much faster than the
>> lme4-based EM approach ...
>> 
>>   cheers
>>     Ben B.
>> 
>> On 16-06-23 12:47 PM, Mollie Brooks wrote:
>>> Hi Philipp,
>>> 
>>> You could also try fitting the model with and without ZI using either
>>> glmmADMB or glmmTMB. Then compare the AICs. I believe model selection
>>> is useful for this, but I could be missing something since the
>>> simulation procedure that Thierry described seems to be recommended
>>> more often.
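
A rough sketch of the comparison suggested above, using glmmTMB on simulated data. The data frame, the single grouping factor `g` and the 40% structural-zero rate are assumptions made for the example only.

```r
library(glmmTMB)

# Simulated zero-inflated counts with one grouping factor (hypothetical)
set.seed(2016)
n_groups <- 30
n <- 1500
g <- sample(seq_len(n_groups), n, replace = TRUE)
x <- rnorm(n)
b <- rnorm(n_groups, sd = 0.4)              # random intercepts
count <- rpois(n, exp(1 + 0.3 * x + b[g]))
structural_zero <- rbinom(n, size = 1, prob = 0.4) == 1
d <- data.frame(y = ifelse(structural_zero, 0L, count),
                x = x, g = factor(g))

# Same conditional model, without and with a zero-inflation component
m_pois <- glmmTMB(y ~ x + (1 | g), family = poisson, data = d)
m_zip  <- glmmTMB(y ~ x + (1 | g), ziformula = ~ 1,
                  family = poisson, data = d)

# Lower AIC indicates the better-supported model
aic_tab <- AIC(m_pois, m_zip)
print(aic_tab)
```

With real data it is probably worth also comparing against a model that handles overdispersion (for example a negative binomial family), since overdispersion and zero-inflation can be hard to tell apart.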
>>> 
>>> https://github.com/glmmTMB/glmmTMB
>>> http://glmmadmb.r-forge.r-project.org
>>> 
>>> glmmTMB is still in the development phase, but we’ve done a lot of
>>> testing.
>>> 
>>> cheers, Mollie
>>> 
>>> ------------------------ Mollie Brooks, PhD Postdoctoral Researcher,
>>> Population Ecology Research Group Department of Evolutionary Biology
>>> & Environmental Studies, University of Zürich
>>> http://www.popecol.org/team/mollie-brooks/
>>> 
>>> 
>>>> On 23Jun 2016, at 8:22, Philipp Singer <killver at gmail.com> wrote:
>>>> 
>>>> Thanks, great information, that is really helpful.
>>>> 
>>>> I agree that those are different things, however when using a
>>>> random effect for overdispersion, I can simulate the same number of
>>>> zero outcomes (~95%).
>>>> 
>>>> On 23.06.2016 15:50, Thierry Onkelinx wrote:
>>>>> Be careful when using overdispersion to model zero-inflation.
>>>>> Those are two different things.
>>>>> 
>>>>> I've put some information together in
>>>>> http://rpubs.com/INBOstats/zeroinflation
>>>>> 
>>>>> ir. Thierry Onkelinx
>>>>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
>>>>> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
>>>>> Kliniekstraat 25
>>>>> 1070 Anderlecht
>>>>> Belgium
>>>>> 
>>>>> To call in the statistician after the experiment is done may be no
>>>>> more than asking him to perform a post-mortem examination: he may be
>>>>> able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
>>>>> The plural of anecdote is not data. ~ Roger Brinner
>>>>> The combination of some data and an aching desire for an answer does
>>>>> not ensure that a reasonable answer can be extracted from a given body
>>>>> of data. ~ John Tukey
>>>>> 
>>>>> 2016-06-23 12:42 GMT+02:00 Philipp Singer <killver at gmail.com>:
>>>>> 
>>>>> Thanks! It seems that accounting for overdispersion is very
>>>>> important; once I do that, the zeros are captured well.
>>>>> 
>>>>> 
>>>>> On 23.06.2016 11:50, Thierry Onkelinx wrote:
>>>>>> Dear Philipp,
>>>>>> 
>>>>>> 1. Fit a Poisson model to the data.
>>>>>> 2. Simulate a new response vector for the dataset according to the
>>>>>>    model.
>>>>>> 3. Count the number of zeros in the simulated response vector.
>>>>>> 4. Repeat steps 2 and 3 a decent number of times and plot a histogram
>>>>>>    of the number of zeros in the simulations.
>>>>>> 
>>>>>> If the number of zeros in the original dataset is larger than in the
>>>>>> simulations, then the model can't capture all the zeros. In that
>>>>>> case, first try to update the model and repeat the procedure. If that
>>>>>> fails, look at zero-inflated models.
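
The procedure above could be sketched roughly as follows. This uses a plain `glm` on simulated data so that it runs with base R only; the data, the covariate `x` and the 30% structural-zero rate are assumptions for the example.

```r
# Simulated counts with extra zeros, so the check has something to detect
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rpois(n, exp(1 + 0.5 * x))
y[rbinom(n, size = 1, prob = 0.3) == 1] <- 0L
d <- data.frame(y = y, x = x)

# Step 1: fit a Poisson model
fit <- glm(y ~ x, family = poisson, data = d)

# Steps 2-4: simulate many response vectors and count zeros in each
sims <- simulate(fit, nsim = 1000)   # one column per simulated response
zeros_sim <- colSums(sims == 0)
zeros_obs <- sum(d$y == 0)

# Histogram of simulated zero counts, with the observed count marked
hist(zeros_sim,
     xlim = range(c(zeros_sim, zeros_obs)),
     main = "Zeros in simulated responses",
     xlab = "Number of zeros")
abline(v = zeros_obs, col = "red", lwd = 2)

# Share of simulations with at least as many zeros as observed
mean(zeros_sim >= zeros_obs)
```

`simulate()` also has a method for models fitted with glmer, so the same steps should carry over to a mixed model.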
>>>>>> 
>>>>>> Best regards,
>>>>>> 
>>>>>> 2016-06-23 11:27 GMT+02:00 Philipp Singer <killver at gmail.com>:
>>>>>> 
>>>>>> Thanks Thierry - That totally makes sense. Is there some way of
>>>>>> formally checking that, except thinking about the setting and
>>>>>> underlying processes?
>>>>>> 
>>>>>> On 23.06.2016 11:04, Thierry Onkelinx wrote:
>>>>>>> Dear Philipp,
>>>>>>> 
>>>>>>> Do you have just lots of zeros, or more zeros than the Poisson
>>>>>>> distribution can explain? Those are two different things. The first
>>>>>>> example below generates data from a Poisson distribution and has 99%
>>>>>>> zeros but no zero-inflation. The second example has only 1% zeros
>>>>>>> but is clearly zero-inflated.
>>>>>>> 
>>>>>>> set.seed(1)
>>>>>>> n <- 1e8
>>>>>>> # Plain Poisson with a small mean: ~99% zeros, no zero-inflation
>>>>>>> sim <- rpois(n, lambda = 0.01)
>>>>>>> mean(sim == 0)
>>>>>>> hist(sim)
>>>>>>> 
>>>>>>> # Poisson with a large mean plus 1% structural zeros: zero-inflated
>>>>>>> sim.infl <- rbinom(n, size = 1, prob = 0.99) * rpois(n, lambda = 1000)
>>>>>>> mean(sim.infl == 0)
>>>>>>> hist(sim.infl)
>>>>>>> 
>>>>>>> So before looking for zero-inflated models, try to model the zeros.
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> 
>>>>>>> 2016-06-23 10:07 GMT+02:00 Philipp Singer <killver at gmail.com>:
>>>>>>> 
>>>>>>> Dear group - I am currently fitting a Poisson glmer where I have an
>>>>>>> excess of outcomes that are zero (>95%). I am now debating how to
>>>>>>> proceed and came up with three options:
>>>>>>> 
>>>>>>> 1.) Just fit a regular glmer to the complete data. I am not fully
>>>>>>> sure how to interpret the coefficients then: are they optimizing more
>>>>>>> towards distinguishing zero from non-zero, or do they also capture
>>>>>>> the differences among the non-zero outcomes?
>>>>>>> 
>>>>>>> 2.) Leave all zeros out of the data and fit a glmer to only those
>>>>>>> outcomes that are non-zero. Then, I would only learn about
>>>>>>> differences in the non-zero outcomes though.
>>>>>>> 
>>>>>>> 3.) Use a zero-inflated Poisson model. My data is quite large-scale,
>>>>>>> so I am currently playing around with the EM implementation of Bolker
>>>>>>> et al. that alternates between fitting a glmer with data that are
>>>>>>> weighted according to their zero probability, and fitting a logistic
>>>>>>> regression for the probability that a data point is zero. The method
>>>>>>> is elaborated for the OWL data in:
>>>>>>> 
>>>>>>> https://groups.nceas.ucsb.edu/non-linear-modeling/projects/owls/WRITEUP/owls.pdf
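
To make the alternating scheme in option 3 concrete, here is a simplified sketch of an EM loop for a zero-inflated Poisson. It is not the implementation from the owls write-up: it uses plain `glm` instead of glmer (no random effects), an intercept-only zero model, and simulated data; all variable names are made up for the example.

```r
# Simulated zero-inflated Poisson data (30% structural zeros)
set.seed(42)
n <- 2000
x <- rnorm(n)
structural <- rbinom(n, size = 1, prob = 0.3)
y <- ifelse(structural == 1, 0L, rpois(n, lambda = exp(0.5 + 0.4 * x)))
d <- data.frame(y = y, x = x)

# z = current probability that an observation is a structural zero;
# positive counts can never be structural zeros
d$z <- ifelse(d$y == 0, 0.5, 0)

for (iter in 1:200) {
  # M-step: Poisson fit weighted by the probability of NOT being a
  # structural zero, and a logistic fit for the zero probability
  count_fit <- glm(y ~ x, family = poisson, data = d, weights = 1 - z)
  zero_fit  <- glm(z ~ 1, family = quasibinomial, data = d)

  # E-step: update the structural-zero probabilities for observed zeros
  mu <- fitted(count_fit)
  p0 <- fitted(zero_fit)
  z_new <- ifelse(d$y == 0, p0 / (p0 + (1 - p0) * exp(-mu)), 0)

  converged <- max(abs(z_new - d$z)) < 1e-8
  d$z <- z_new
  if (converged) break
}

pi_hat <- unname(plogis(coef(zero_fit)))  # estimated zero-inflation probability
count_coef <- coef(count_fit)             # coefficients of the count part
pi_hat
count_coef
```

In this sketch the count coefficients describe the Poisson process for observations that are not structural zeros (which can still produce zeros by chance), so they are not the same as coefficients from a model fitted to the non-zero outcomes only.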
>>>>>>> 
>>>>>>> I am not fully sure how to interpret the results for the
>>>>>>> zero-inflated version though. Would I need to interpret the
>>>>>>> coefficients of the glmer in the same way as I would for option 2)?
>>>>>>> And then, on top of that, interpret the coefficients of the logistic
>>>>>>> regression regarding whether something is in the perfect or imperfect
>>>>>>> state? I am also not quite sure what the common approach for the
>>>>>>> zformula is here. The OWL elaborations only use zformula=z~1, so no
>>>>>>> random effect; I would use the same formula as for the glmer.
>>>>>>> 
>>>>>>> I would appreciate any help and pointers.
>>>>>>> 
>>>>>>> Thanks! Philipp
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> R-sig-mixed-models at r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models