[R-sig-ME] mixed zero-inflated poisson regression

Ivar Herfindal ivar.herfindal at bio.ntnu.no
Thu Nov 17 15:54:41 CET 2011


Hi Jarrod (and all other r-sig-mixed-members who answered me)

First, thanks to all who replied. I learned a lot from all answers, and 
it seems I have got a better understanding on the topic now.

Jarrod Hadfields suggestion about priors helped me a lot to get betters 
models. As you suggested my models did not run well at all and there 
were several issues about my code. There were a lot of serious problems 
associated with the models (autocorrelation, correlation between chains 
from different parameters, poor convergence). Refining my code according 
to suggestions makes the models now run much better. I also centralised 
the predictor variable, which helped a lot on the correlation between 
the chains of the parameter estimates for the fixed factors. It also 
appears that the models have to run for some time in order to converge, 
and that I need a quite high thinning to reduce autocorrelation. 
Finally, using the priors suggested by Jarrod (with the 
fix=2-adjustment) of course did the trick and made models more stable 
and reliable. Currently I am running 3 MCMCglmm for each model, with 
nitt=1000000, burn=500000 and thin=1000. This seems to do give 
reasonable output (Gelman diagnostics < 1.02 for all chains), which are 
consistent among similar runs, although it takes some time.

As suggested by other list members, it may be a good idea to verify 
results using a MLE-based approach, for instance using ADMB. A recent 
(future) book was also suggested (even as a Christmas present for 
myself. I was thinking more in line of a pair of new skis, but perhaps a 
book about zero-inflated models and generalized linear mixed models 
would be just as fun). Link to the book:  http://www.highstat.com/book4.htm

Again, thanks to all for the help.

Ivar

On 16.11.2011 21:43, Jarrod Hadfield wrote:
> Hi Denise,
>
> Yes I did. Thanks for correcting it.
>
> Jarrod
>
> On 16 Nov 2011, at 19:58, Denise Rocha wrote:
>
>> Hi Dr. Jarrod.
>>
>> Didn't you want to say:
>>
>> "prior<-list(R=list(V=diag(2), nu=0.002, fix=*2*)..." ?
>>
>> Thank you.
>> Denise.
>>
>> 2011/11/14 Jarrod Hadfield <j.hadfield at ed.ac.uk 
>> <mailto:j.hadfield at ed.ac.uk>>
>>
>>     Hi,
>>
>>     ZIP models generally mix poorly and often need to be run for long
>>     periods of time. However, you seem to be estimating a residual
>>     variance for the zero-inflation bit and this is not identifiable.
>>     In this case you will never get convergence or good mixing. I
>>     recommend you fix the residual variance of the zero-inflation
>>     process at one using a prior like:
>>
>>     prior<-list(R=list(V=diag(2), nu=0.002, fix=1),
>>     G=list(G1=list(V=1, nu=1, alpha.mu <http://alpha.mu>=0,
>>     alpha.V=1000)))
>>
>>     I have used a parameter expansion for the random effects.
>>     However, your specification is a bit odd because you assume that
>>     the random effects for the poisson and zero-inflation have equal
>>     variance and a correlation of 1. ~us(trait):gr  or
>>      ~idh(trait):gr are probably more reasonable and you *could* use
>>     the priors:
>>
>>     prior<-list(R=list(V=diag(2), nu=0.002, fix=1),
>>     G=list(G1=list(V=diag(2), nu=2, alpha.mu
>>     <http://alpha.mu>=c(0,0), alpha.V=diag(2)*1000)))
>>
>>     for ~us(trait):gr, and
>>
>>     prior<-list(R=list(V=diag(2), nu=0.002, fix=1),
>>     G=list(G1=list(V=diag(2), nu=1, alpha.mu
>>     <http://alpha.mu>=c(0,0), alpha.V=diag(2)*1000)))
>>
>>     ~idh(trait):gr.
>>
>>     ~gr might make more sense if  you use a zero-altered poisson
>>     ("zapoisson") rather than a zero-inflated poisson.
>>
>>     Cheers,
>>
>>     Jarrod
>>
>>
>>
>>
>>     On 14 Nov 2011, at 16:17, Ivar Herfindal wrote:
>>
>>         Dear Thierry
>>
>>         Thank you for your quick response. I have previously tried to
>>         follow the CourseNote you suggested without success. I gave
>>         it another try today, and it seems I have manged to generate
>>         such a zero-inflated model as I want during the day. The
>>         syntax looks something like this:
>>
>>         mcmodm1 <- MCMCglmm(Y ~ trait -1 + at.level(trait, 1:2):A,
>>         data=dataa,
>>           family="zipoisson", rcov=~idh(trait):units, nitt=30000,
>>         random=~gr)
>>
>>         However, the main challenge is that the model seems quite
>>         unstable withmy "real" data. Even with 100000 iterations, two
>>         identical specified models gives quite different output, and
>>         plotting the mcmodm1$Sol suggest not very stable estimates.
>>         This occurs also on my simulated data.
>>
>>         Do anyone have any suggestion on how to make the model more
>>         stable (e.g. by the rcov or random arguments)? Or do I just
>>         have to run it as long as it takes to get "fairly precise"
>>         estimates?
>>
>>         Ivar
>>
>>         On 14.11.2011 10:40, ONKELINX, Thierry wrote:
>>
>>             Dear Ivar,
>>
>>             Your assumption on MCMCglmm is incorrect. You can specify
>>             different predictor for both the count and the binomial
>>             component. Have a look at the CourseNotes vignette for an
>>             example.
>>
>>             Best regards,
>>
>>             Thierry
>>
>>             ----------------------------------------------------------------------------
>>             ir. Thierry Onkelinx
>>             Instituut voor natuur- en bosonderzoek
>>             team Biometrie&  Kwaliteitszorg
>>             Gaverstraat 4
>>             9500 Geraardsbergen
>>             Belgium
>>
>>             Research Institute for Nature and Forest
>>             team Biometrics&  Quality Assurance
>>             Gaverstraat 4
>>             9500 Geraardsbergen
>>             Belgium
>>
>>             tel. + 32 54/436 185
>>             Thierry.Onkelinx at inbo.be <mailto:Thierry.Onkelinx at inbo.be>
>>             www.inbo.be <http://www.inbo.be>
>>
>>             To call in the statistician after the experiment is done
>>             may be no more than asking him to perform a post-mortem
>>             examination: he may be able to say what the experiment
>>             died of.
>>             ~ Sir Ronald Aylmer Fisher
>>
>>             The plural of anecdote is not data.
>>             ~ Roger Brinner
>>
>>             The combination of some data and an aching desire for an
>>             answer does not ensure that a reasonable answer can be
>>             extracted from a given body of data.
>>             ~ John Tukey
>>
>>                 -----Oorspronkelijk bericht-----
>>                 Van: r-sig-mixed-models-bounces at r-project.org
>>                 <mailto:r-sig-mixed-models-bounces at r-project.org>
>>                 [mailto:r-sig-mixed-models- <mailto:r-sig-mixed-models->
>>                 bounces at r-project.org <mailto:bounces at r-project.org>]
>>                 Namens Ivar Herfindal
>>                 Verzonden: maandag 14 november 2011 9:57
>>                 Aan: r-sig-mixed-models at r-project.org
>>                 <mailto:r-sig-mixed-models at r-project.org>
>>                 Onderwerp: [R-sig-ME] mixed zero-inflated poisson
>>                 regression
>>
>>                 Dear all
>>
>>                 It seems like I have reached the limit of my
>>                 statistical skills, and now turn to the
>>                 mixed-list for some assistance.
>>
>>                 I have a dataset with a response variable that
>>                 appears to have a distribution that
>>                 is zero-inflated poisson (ZIP) (or close to). The
>>                 response variable is number of
>>                 fledglings from nests with known mothers.
>>                 The number of fledglings has a lot of zeros that
>>                 comes from failed breeding
>>                 attempts. The number of fledgings can therefore be
>>                 considered as a two-step
>>                 process:
>>                 1. do the female breed, and
>>                 2. if the female breed, what was the number of
>>                 fledglings.
>>
>>                 I want to relate the number of fledglings to some
>>                 predictor variables, and I am
>>                 particularly interested in the combined effect of
>>                 these variables on the
>>                 probability to breed and number of fledglings given
>>                 breeding. I have several
>>                 observations per female (multiple years) and each
>>                 nest location is also
>>                 represented by several females (in different years).
>>                 I therefore need/want to
>>                 account for random factors. I guess I could run two
>>                 separate models (one
>>                 binomial and one poisson), but I need to make
>>                 predictions on how my
>>                 explanatory variables affect the number of fledglings
>>                 irrespective of the
>>                 mechanism behind.
>>
>>                 I have searched R-sites a lot, and it found some
>>                 packages that can model ZIP.
>>                 1. pscl (function zeroinfl)
>>                 Pros: have the option to relate both the poisson and
>>                 binomial process to the
>>                 predictor variable, so I can evaluate how probability
>>                 of breeding and number of
>>                 fledglings depend on my predictor variables
>>                 Cons: Can not include random factors.
>>
>>                 2. glmmADMB (function glmmadmb), MCMCglmm, and gamlss
>>                 Pros: allows random factors
>>                 Cons: 1. all packages seems to assume that all zeros
>>                 have the same probability
>>                 of belonging to the zero component, i.e. the
>>                 logit-link structure does not depend
>>                 on the predictor variable.
>>                 2. For some of the packages (particularly MCMCglmm) I
>>                 have some challenges
>>                 about how to specify the model correctly.
>>
>>                 Does anyopne in the R(mixed) community have
>>                 experience on this? I provide a
>>                 simple example below to illustrate what I want.
>>
>>                 Ivar Herfindal
>>
>>                 ## A simple reproducible example
>>                 library(pscl)
>>                 library(glmmADMB)
>>                 library(MCMCglmm)
>>                 library(gamlss)
>>                 set.seed(4242)
>>
>>                 # create some dummy data that have a lot of zeros in
>>                 the dependent variable
>>                 dataa<- cbind.data.frame(Y=rbinom(100, 1, 0.7))
>>                 dataa$Y<- dataa$Y *
>>                 rpois(100, 3) # the response dataa$A<- rnorm(100) #
>>                 explanatory variable
>>                 dataa$gr<- factor(1:10) # grouping variable
>>
>>                 # fit models with different packages
>>
>>                 ## First non-mixed models
>>                 ### pscl zeroinfl
>>                 zimod<- zeroinfl(Y ~ A, data=dataa, dist="poisson") #
>>                 Get estimate of A on
>>                 poisson and binomial process separate, this is what I
>>                 want (I believe so...)
>>
>>                 zimod2<- zeroinfl(Y ~ A|1, data=dataa,
>>                 dist="poisson") # All zero counts have
>>                 the same probability of belonging to the zero component
>>
>>                 ### glmmADMB
>>                 admod<- glmmadmb(Y ~ A, data=dataa, family="poisson",
>>                 zeroInflation=TRUE)
>>                 # Same results as zimod2.
>>                 # Q: Is it possible to obtain similar results as
>>                 zimod, i.e.
>>                 # separate the effect of A on the poisson and
>>                 binomial process?
>>
>>                 ### MCMCglmm
>>                 mcmod<- MCMCglmm(Y ~ A, data=dataa, family="zipoisson",
>>                 rcov=~idh(trait):units)
>>                 # more or less same as admod and zimod2, but
>>                 "plot(mcmod$Sol)" reveals large
>>                 variation.
>>                 # Can change burn and nitt to get more stable(?)
>>
>>                 mcmod2<- MCMCglmm(Y ~ A, data=dataa, family="zipoisson",
>>                 rcov=~idh(trait):units, burn=10000, nitt=20000) # Q:
>>                 Is it possible to separate the
>>                 effect of A on the poisson and binomial process?
>>
>>                 ### gamlss
>>                 gamod<- gamlss(Y ~ A, data=dataa, family=ZIP) # same
>>                 as zimod2 and admod
>>
>>                 ## Some attempts on mixed models with gr as random
>>                 factor admodm<-
>>                 glmmadmb(Y ~ A, data=dataa, family="poisson",
>>                 zeroInflation=TRUE,
>>                 random=~(1|gr)) mcmodm<- MCMCglmm(Y ~ A, data=dataa,
>>                 family="zipoisson", rcov=~idh(trait):units,
>>                 random=~idh(trait):gr) gamodm<-
>>                 gamlss(Y ~ A + random(gr), data=dataa, family=ZIP) #
>>                 For glmmadmb and gamlss
>>                 I think the models are specified ok # For MCMCglmm I
>>                 am not sure if it is
>>                 specified correctly # But I still want to be able to
>>                 separate the effect of A on the
>>                 two processes.
>>
>>                 --
>>                 --------------------------------------------------
>>                 Ivar Herfindal (PhD, Researcher)
>>                 Centre for Conservation Biology
>>                 Department of Biology
>>                 Norwegian University for Science and Technology
>>                 N-7491 Trondheim, Norway
>>
>>                 email: Ivar.Herfindal at bio.ntnu.no
>>                 <mailto:Ivar.Herfindal at bio.ntnu.no>
>>                 web: http://www.ntnu.no/employees/ivar.herfindal
>>                 phone:     +47 73596253
>>                 Cellphone: +47 91625333
>>                 fax:       +47 73596090
>>
>>                 _______________________________________________
>>                 R-sig-mixed-models at r-project.org
>>                 <mailto:R-sig-mixed-models at r-project.org> mailing list
>>                 https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>>
>>         -- 
>>         --------------------------------------------------
>>         Ivar Herfindal (PhD, Researcher)
>>         Centre for Conservation Biology
>>         Department of Biology
>>         Norwegian University for Science and Technology
>>         N-7491 Trondheim, Norway
>>
>>         email: Ivar.Herfindal at bio.ntnu.no
>>         <mailto:Ivar.Herfindal at bio.ntnu.no>
>>         web: http://www.ntnu.no/employees/ivar.herfindal
>>         phone:     +47 73596253
>>         Cellphone: +47 91625333
>>         fax:       +47 73596090
>>
>>         _______________________________________________
>>         R-sig-mixed-models at r-project.org
>>         <mailto:R-sig-mixed-models at r-project.org> mailing list
>>         https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>>
>>
>>     -- 
>>     The University of Edinburgh is a charitable body, registered in
>>     Scotland, with registration number SC005336.
>>
>>
>>     _______________________________________________
>>     R-sig-mixed-models at r-project.org
>>     <mailto:R-sig-mixed-models at r-project.org> mailing list
>>     https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>>
>>
>>
>> -- 
>> *Denise Rocha Ayres*
>> Mestre em Genética e Melhoramento Animal
>> Doutoranda em Genética e Melhoramento Animal
>> UNESP-Câmpus de Jaboticabal
>>
>
>
>
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.

-- 
--------------------------------------------------
Ivar Herfindal (PhD, Researcher)
Centre for Conservation Biology
Department of Biology
Norwegian University for Science and Technology
N-7491 Trondheim, Norway

email:     Ivar.Herfindal at bio.ntnu.no
web:       http://www.ntnu.no/employees/ivar.herfindal
phone:     +47 73596253
Cellphone: +47 91625333
fax:       +47 73596090




More information about the R-sig-mixed-models mailing list