[R-sig-ME] mixed zero-inflated poisson regression

Mon Nov 14 18:00:59 CET 2011

Hi,

ZIP models generally mix poorly and often need to be run for long  
periods of time. However, you seem to be estimating a residual  
variance for the zero-inflation bit and this is not identifiable. In  
this case you will never get convergence or good mixing. I recommend  
you fix the residual variance of the zero-inflation process at one  
using a prior like:

prior<-list(R=list(V=diag(2), nu=0.002, fix=1), G=list(G1=list(V=1,  
nu=1, alpha.mu=0, alpha.V=1000)))

I have used a parameter expansion for the random effects. However,  
your specification is a bit odd because you assume that the random  
effects for the poisson and zero-inflation have equal variance and a  
correlation of 1. ~us(trait):gr  or  ~idh(trait):gr are probably more  
reasonable and you *could* use the priors:

prior<-list(R=list(V=diag(2), nu=0.002, fix=1),  
G=list(G1=list(V=diag(2), nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*1000)))

for ~us(trait):gr, and

prior<-list(R=list(V=diag(2), nu=0.002, fix=1),  
G=list(G1=list(V=diag(2), nu=1, alpha.mu=c(0,0), alpha.V=diag(2)*1000)))

~idh(trait):gr.

~gr might make more sense if  you use a zero-altered poisson  
("zapoisson") rather than a zero-inflated poisson.

Cheers,

Jarrod

On 14 Nov 2011, at 16:17, Ivar Herfindal wrote:

> Dear Thierry
>
> Thank you for your quick response. I have previously tried to follow  
> the CourseNote you suggested without success. I gave it another try  
> today, and it seems I have manged to generate such a zero-inflated  
> model as I want during the day. The syntax looks something like this:
>
> mcmodm1 <- MCMCglmm(Y ~ trait -1 + at.level(trait, 1:2):A, data=dataa,
>    family="zipoisson", rcov=~idh(trait):units, nitt=30000, random=~gr)
>
> However, the main challenge is that the model seems quite unstable  
> withmy "real" data. Even with 100000 iterations, two identical  
> specified models gives quite different output, and plotting the  
> mcmodm1$Sol suggest not very stable estimates. This occurs also on  
> my simulated data.
>
> Do anyone have any suggestion on how to make the model more stable  
> (e.g. by the rcov or random arguments)? Or do I just have to run it  
> as long as it takes to get "fairly precise" estimates?
>
> Ivar
>
> On 14.11.2011 10:40, ONKELINX, Thierry wrote:
>> Dear Ivar,
>>
>> Your assumption on MCMCglmm is incorrect. You can specify different  
>> predictor for both the count and the binomial component. Have a  
>> look at the CourseNotes vignette for an example.
>>
>> Best regards,
>>
>> Thierry
>>
>> ----------------------------------------------------------------------------
>> ir. Thierry Onkelinx
>> Instituut voor natuur- en bosonderzoek
>> team Biometrie&  Kwaliteitszorg
>> Gaverstraat 4
>> 9500 Geraardsbergen
>> Belgium
>>
>> Research Institute for Nature and Forest
>> team Biometrics&  Quality Assurance
>> Gaverstraat 4
>> 9500 Geraardsbergen
>> Belgium
>>
>> tel. + 32 54/436 185
>> Thierry.Onkelinx at inbo.be
>> www.inbo.be
>>
>> To call in the statistician after the experiment is done may be no  
>> more than asking him to perform a post-mortem examination: he may  
>> be able to say what the experiment died of.
>> ~ Sir Ronald Aylmer Fisher
>>
>> The plural of anecdote is not data.
>> ~ Roger Brinner
>>
>> The combination of some data and an aching desire for an answer  
>> does not ensure that a reasonable answer can be extracted from a  
>> given body of data.
>> ~ John Tukey
>>
>>> -----Oorspronkelijk bericht-----
>>> Van: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-
>>> bounces at r-project.org] Namens Ivar Herfindal
>>> Verzonden: maandag 14 november 2011 9:57
>>> Aan: r-sig-mixed-models at r-project.org
>>> Onderwerp: [R-sig-ME] mixed zero-inflated poisson regression
>>>
>>> Dear all
>>>
>>> It seems like I have reached the limit of my statistical skills,  
>>> and now turn to the
>>> mixed-list for some assistance.
>>>
>>> I have a dataset with a response variable that appears to have a  
>>> distribution that
>>> is zero-inflated poisson (ZIP) (or close to). The response  
>>> variable is number of
>>> fledglings from nests with known mothers.
>>> The number of fledglings has a lot of zeros that comes from failed  
>>> breeding
>>> attempts. The number of fledgings can therefore be considered as a  
>>> two-step
>>> process:
>>> 1. do the female breed, and
>>> 2. if the female breed, what was the number of fledglings.
>>>
>>> I want to relate the number of fledglings to some predictor  
>>> variables, and I am
>>> particularly interested in the combined effect of these variables  
>>> on the
>>> probability to breed and number of fledglings given breeding. I  
>>> have several
>>> observations per female (multiple years) and each nest location is  
>>> also
>>> represented by several females (in different years). I therefore  
>>> need/want to
>>> account for random factors. I guess I could run two separate  
>>> models (one
>>> binomial and one poisson), but I need to make predictions on how my
>>> explanatory variables affect the number of fledglings irrespective  
>>> of the
>>> mechanism behind.
>>>
>>> I have searched R-sites a lot, and it found some packages that can  
>>> model ZIP.
>>> 1. pscl (function zeroinfl)
>>> Pros: have the option to relate both the poisson and binomial  
>>> process to the
>>> predictor variable, so I can evaluate how probability of breeding  
>>> and number of
>>> fledglings depend on my predictor variables
>>> Cons: Can not include random factors.
>>>
>>> 2. glmmADMB (function glmmadmb), MCMCglmm, and gamlss
>>> Pros: allows random factors
>>> Cons: 1. all packages seems to assume that all zeros have the same  
>>> probability
>>> of belonging to the zero component, i.e. the logit-link structure  
>>> does not depend
>>> on the predictor variable.
>>> 2. For some of the packages (particularly MCMCglmm) I have some  
>>> challenges
>>> about how to specify the model correctly.
>>>
>>> Does anyopne in the R(mixed) community have experience on this? I  
>>> provide a
>>> simple example below to illustrate what I want.
>>>
>>> Ivar Herfindal
>>>
>>> ## A simple reproducible example
>>> library(pscl)
>>> library(glmmADMB)
>>> library(MCMCglmm)
>>> library(gamlss)
>>> set.seed(4242)
>>>
>>> # create some dummy data that have a lot of zeros in the dependent  
>>> variable
>>> dataa<- cbind.data.frame(Y=rbinom(100, 1, 0.7)) dataa$Y<- dataa$Y *
>>> rpois(100, 3) # the response dataa$A<- rnorm(100) # explanatory  
>>> variable
>>> dataa$gr<- factor(1:10) # grouping variable
>>>
>>> # fit models with different packages
>>>
>>> ## First non-mixed models
>>> ### pscl zeroinfl
>>> zimod<- zeroinfl(Y ~ A, data=dataa, dist="poisson") # Get estimate  
>>> of A on
>>> poisson and binomial process separate, this is what I want (I  
>>> believe so...)
>>>
>>> zimod2<- zeroinfl(Y ~ A|1, data=dataa, dist="poisson") # All zero  
>>> counts have
>>> the same probability of belonging to the zero component
>>>
>>> ### glmmADMB
>>> admod<- glmmadmb(Y ~ A, data=dataa, family="poisson",  
>>> zeroInflation=TRUE)
>>> # Same results as zimod2.
>>> # Q: Is it possible to obtain similar results as zimod, i.e.
>>> # separate the effect of A on the poisson and binomial process?
>>>
>>> ### MCMCglmm
>>> mcmod<- MCMCglmm(Y ~ A, data=dataa, family="zipoisson",
>>> rcov=~idh(trait):units)
>>> # more or less same as admod and zimod2, but "plot(mcmod$Sol)"  
>>> reveals large
>>> variation.
>>> # Can change burn and nitt to get more stable(?)
>>>
>>> mcmod2<- MCMCglmm(Y ~ A, data=dataa, family="zipoisson",
>>> rcov=~idh(trait):units, burn=10000, nitt=20000) # Q: Is it  
>>> possible to separate the
>>> effect of A on the poisson and binomial process?
>>>
>>> ### gamlss
>>> gamod<- gamlss(Y ~ A, data=dataa, family=ZIP) # same as zimod2 and  
>>> admod
>>>
>>> ## Some attempts on mixed models with gr as random factor admodm<-
>>> glmmadmb(Y ~ A, data=dataa, family="poisson", zeroInflation=TRUE,
>>> random=~(1|gr)) mcmodm<- MCMCglmm(Y ~ A, data=dataa,
>>> family="zipoisson", rcov=~idh(trait):units, random=~idh(trait):gr)  
>>> gamodm<-
>>> gamlss(Y ~ A + random(gr), data=dataa, family=ZIP) # For glmmadmb  
>>> and gamlss
>>> I think the models are specified ok # For MCMCglmm I am not sure  
>>> if it is
>>> specified correctly # But I still want to be able to separate the  
>>> effect of A on the
>>> two processes.
>>>
>>> --
>>> --------------------------------------------------
>>> Ivar Herfindal (PhD, Researcher)
>>> Centre for Conservation Biology
>>> Department of Biology
>>> Norwegian University for Science and Technology
>>> N-7491 Trondheim, Norway
>>>
>>> email:     Ivar.Herfindal at bio.ntnu.no
>>> web:       http://www.ntnu.no/employees/ivar.herfindal
>>> phone:     +47 73596253
>>> Cellphone: +47 91625333
>>> fax:       +47 73596090
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> -- 
> --------------------------------------------------
> Ivar Herfindal (PhD, Researcher)
> Centre for Conservation Biology
> Department of Biology
> Norwegian University for Science and Technology
> N-7491 Trondheim, Norway
>
> email:     Ivar.Herfindal at bio.ntnu.no
> web:       http://www.ntnu.no/employees/ivar.herfindal
> phone:     +47 73596253
> Cellphone: +47 91625333
> fax:       +47 73596090
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.