[R-sig-ME] mixed zero-inflated poisson regression

Highland Statistics Ltd highstat at highstat.com
Mon Nov 14 13:09:11 CET 2011





> ------------------------------
>
> Message: 2
> Date: Mon, 14 Nov 2011 09:57:11 +0100
> From: Ivar Herfindal<ivar.herfindal at bio.ntnu.no>
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] mixed zero-inflated poisson regression
> Message-ID:<4EC0D7E7.8020503 at bio.ntnu.no>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Dear all
>
> It seems like I have reached the limit of my statistical skills, and now
> turn to the mixed-list for some assistance.
>
> I have a dataset with a response variable that appears to have a
> distribution that is zero-inflated poisson (ZIP) (or close to). The
> response variable is number of fledglings from nests with known mothers.
> The number of fledglings has a lot of zeros that comes from failed
> breeding attempts. The number of fledgings can therefore be considered
> as a two-step process:
> 1. do the female breed, and
> 2. if the female breed, what was the number of fledglings.
>
> I want to relate the number of fledglings to some predictor variables,
> and I am particularly interested in the combined effect of these
> variables on the probability to breed and number of fledglings given
> breeding. I have several observations per female (multiple years) and
> each nest location is also represented by several females (in different
> years). I therefore need/want to account for random factors. I guess I
> could run two separate models (one binomial and one poisson), but I need
> to make predictions on how my explanatory variables affect the number of
> fledglings irrespective of the mechanism behind.
>
> I have searched R-sites a lot, and it found some packages that can model
> ZIP.
> 1. pscl (function zeroinfl)
> Pros: have the option to relate both the poisson and binomial process to
> the predictor variable, so I can evaluate how probability of breeding
> and number of fledglings depend on my predictor variables
> Cons: Can not include random factors.
>
> 2. glmmADMB (function glmmadmb), MCMCglmm, and gamlss
> Pros: allows random factors
> Cons: 1. all packages seems to assume that all zeros have the same
> probability of belonging to the zero component, i.e. the logit-link
> structure does not depend on the predictor variable.
> 2. For some of the packages (particularly MCMCglmm) I have some
> challenges about how to specify the model correctly.
>
> Does anyopne in the R(mixed) community have experience on this? I
> provide a simple example below to illustrate what I want.


Hello,
You can easily program this in WinBUGS (in R via R2WinBUGS). You can 
then chose whether you want to model the binary part as a function of 
covariates or not. The problem is that you are increasing the complexity 
of the model...depending on your sample size, model complexity and 
strength of the relationships, you may encounter mixing trouble. The 
extra problem that you may have is that the number of fledgings tends to 
be small...and may be underdispersed...but this may depends on the 
species. In that case you have underdispersion and zero 
inflation....have fun.


Code for ZIP in WinBUGS can be found in for example Kery (2010), and can 
easily be extended for GLMMs. Alternatively, our next book is a 360 
monograph on zero inflated models for nested data (and/or with 
spatial/temporal correlation). See:   http://www.highstat.com/book4.htm  
Comes out late December, early January....just in case you need a 
Christmas present for yourself

Kind regards,

Alain






> Ivar Herfindal
>
> ## A simple reproducible example
> library(pscl)
> library(glmmADMB)
> library(MCMCglmm)
> library(gamlss)
> set.seed(4242)
>
> # create some dummy data that have a lot of zeros in the dependent variable
> dataa<- cbind.data.frame(Y=rbinom(100, 1, 0.7))
> dataa$Y<- dataa$Y * rpois(100, 3) # the response
> dataa$A<- rnorm(100) # explanatory variable
> dataa$gr<- factor(1:10) # grouping variable
>
> # fit models with different packages
>
> ## First non-mixed models
> ### pscl zeroinfl
> zimod<- zeroinfl(Y ~ A, data=dataa, dist="poisson")
> # Get estimate of A on poisson and binomial process separate, this is
> what I want (I believe so...)
>
> zimod2<- zeroinfl(Y ~ A|1, data=dataa, dist="poisson")
> # All zero counts have the same probability of belonging to the zero
> component
>
> ### glmmADMB
> admod<- glmmadmb(Y ~ A, data=dataa, family="poisson", zeroInflation=TRUE)
> # Same results as zimod2.
> # Q: Is it possible to obtain similar results as zimod, i.e.
> # separate the effect of A on the poisson and binomial process?
>
> ### MCMCglmm
> mcmod<- MCMCglmm(Y ~ A, data=dataa, family="zipoisson",
> rcov=~idh(trait):units)
> # more or less same as admod and zimod2, but "plot(mcmod$Sol)" reveals
> large variation.
> # Can change burn and nitt to get more stable(?)
>
> mcmod2<- MCMCglmm(Y ~ A, data=dataa, family="zipoisson",
> rcov=~idh(trait):units, burn=10000, nitt=20000)
> # Q: Is it possible to separate the effect of A on the poisson and
> binomial process?
>
> ### gamlss
> gamod<- gamlss(Y ~ A, data=dataa, family=ZIP)
> # same as zimod2 and admod
>
> ## Some attempts on mixed models with gr as random factor
> admodm<- glmmadmb(Y ~ A, data=dataa, family="poisson",
> zeroInflation=TRUE, random=~(1|gr))
> mcmodm<- MCMCglmm(Y ~ A, data=dataa, family="zipoisson",
> rcov=~idh(trait):units, random=~idh(trait):gr)
> gamodm<- gamlss(Y ~ A + random(gr), data=dataa, family=ZIP)
> # For glmmadmb and gamlss I think the models are specified ok
> # For MCMCglmm I am not sure if it is specified correctly
> # But I still want to be able to separate the effect of A on the two
> processes.
>


-- 

Dr. Alain F. Zuur
First author of:

1. Analysing Ecological Data (2007).
Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
URL: www.springer.com/0-387-45967-7


2. Mixed effects models and extensions in ecology with R. (2009).
Zuur, AF, Ieno, EN, Walker, N, Saveliev, AA, and Smith, GM. Springer.
http://www.springer.com/life+sci/ecology/book/978-0-387-87457-9


3. A Beginner's Guide to R (2009).
Zuur, AF, Ieno, EN, Meesters, EHWG. Springer
http://www.springer.com/statistics/computational/book/978-0-387-93836-3


Other books: http://www.highstat.com/books.htm


Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Tel: 0044 1358 788177
Email: highstat at highstat.com
URL: www.highstat.com
URL: www.brodgar.com




More information about the R-sig-mixed-models mailing list