[R-sig-ME] mixed zero-inflated poisson regression (Ivar Herfindal)

Julie Black Julie.Black at jncc.gov.uk
Mon Nov 14 12:17:55 CET 2011


Another option, which I am afraid I can't give too specific help with,
is to look at R2Winbugs package. It is a R package which uses Winbugs,
so uses MCMC chains. The pro of this is that you can do everything you
want to do, ie you can include random terms, and the covariates can
affect both the probability that you will have a zero and also the count
of fledglings. 
The con is I find it very difficult to code, it is not a quick and easy
solution so you might be better preserving with MCMCglmm if it is
capable of modelling what you want. 
----------

Message: 2
Date: Mon, 14 Nov 2011 09:57:11 +0100
From: Ivar Herfindal <ivar.herfindal at bio.ntnu.no>
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] mixed zero-inflated poisson regression
Message-ID: <4EC0D7E7.8020503 at bio.ntnu.no>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Dear all

It seems like I have reached the limit of my statistical skills, and now

turn to the mixed-list for some assistance.

I have a dataset with a response variable that appears to have a 
distribution that is zero-inflated poisson (ZIP) (or close to). The 
response variable is number of fledglings from nests with known mothers.

The number of fledglings has a lot of zeros that comes from failed 
breeding attempts. The number of fledgings can therefore be considered 
as a two-step process:
1. do the female breed, and
2. if the female breed, what was the number of fledglings.

I want to relate the number of fledglings to some predictor variables, 
and I am particularly interested in the combined effect of these 
variables on the probability to breed and number of fledglings given 
breeding. I have several observations per female (multiple years) and 
each nest location is also represented by several females (in different 
years). I therefore need/want to account for random factors. I guess I 
could run two separate models (one binomial and one poisson), but I need

to make predictions on how my explanatory variables affect the number of

fledglings irrespective of the mechanism behind.

I have searched R-sites a lot, and it found some packages that can model

ZIP.
1. pscl (function zeroinfl)
Pros: have the option to relate both the poisson and binomial process to

the predictor variable, so I can evaluate how probability of breeding 
and number of fledglings depend on my predictor variables
Cons: Can not include random factors.

2. glmmADMB (function glmmadmb), MCMCglmm, and gamlss
Pros: allows random factors
Cons: 1. all packages seems to assume that all zeros have the same 
probability of belonging to the zero component, i.e. the logit-link 
structure does not depend on the predictor variable.
2. For some of the packages (particularly MCMCglmm) I have some 
challenges about how to specify the model correctly.

Does anyopne in the R(mixed) community have experience on this? I 
provide a simple example below to illustrate what I want.

Ivar Herfindal

## A simple reproducible example
library(pscl)
library(glmmADMB)
library(MCMCglmm)
library(gamlss)
set.seed(4242)

# create some dummy data that have a lot of zeros in the dependent
variable
dataa <- cbind.data.frame(Y=rbinom(100, 1, 0.7))
dataa$Y <- dataa$Y * rpois(100, 3) # the response
dataa$A <- rnorm(100) # explanatory variable
dataa$gr <- factor(1:10) # grouping variable

# fit models with different packages

## First non-mixed models
### pscl zeroinfl
zimod <- zeroinfl(Y ~ A, data=dataa, dist="poisson")
# Get estimate of A on poisson and binomial process separate, this is 
what I want (I believe so...)

zimod2 <- zeroinfl(Y ~ A|1, data=dataa, dist="poisson")
# All zero counts have the same probability of belonging to the zero 
component

### glmmADMB
admod <- glmmadmb(Y ~ A, data=dataa, family="poisson",
zeroInflation=TRUE)
# Same results as zimod2.
# Q: Is it possible to obtain similar results as zimod, i.e.
# separate the effect of A on the poisson and binomial process?

### MCMCglmm
mcmod <- MCMCglmm(Y ~ A, data=dataa, family="zipoisson", 
rcov=~idh(trait):units)
# more or less same as admod and zimod2, but "plot(mcmod$Sol)" reveals 
large variation.
# Can change burn and nitt to get more stable(?)

mcmod2 <- MCMCglmm(Y ~ A, data=dataa, family="zipoisson", 
rcov=~idh(trait):units, burn=10000, nitt=20000)
# Q: Is it possible to separate the effect of A on the poisson and 
binomial process?

### gamlss
gamod <- gamlss(Y ~ A, data=dataa, family=ZIP)
# same as zimod2 and admod

## Some attempts on mixed models with gr as random factor
admodm <- glmmadmb(Y ~ A, data=dataa, family="poisson", 
zeroInflation=TRUE, random=~(1|gr))
mcmodm <- MCMCglmm(Y ~ A, data=dataa, family="zipoisson", 
rcov=~idh(trait):units, random=~idh(trait):gr)
gamodm <- gamlss(Y ~ A + random(gr), data=dataa, family=ZIP)
# For glmmadmb and gamlss I think the models are specified ok
# For MCMCglmm I am not sure if it is specified correctly
# But I still want to be able to separate the effect of A on the two 
processes.

-- 
--------------------------------------------------
Ivar Herfindal (PhD, Researcher)
Centre for Conservation Biology
Department of Biology
Norwegian University for Science and Technology
N-7491 Trondheim, Norway

email:     Ivar.Herfindal at bio.ntnu.no
web:       http://www.ntnu.no/employees/ivar.herfindal
phone:     +47 73596253
Cellphone: +47 91625333
fax:       +47 73596090



------------------------------

Message: 3
Date: Mon, 14 Nov 2011 09:40:49 +0000
From: "ONKELINX, Thierry" <Thierry.ONKELINX at inbo.be>
To: "ivar.herfindal at bio.ntnu.no" <ivar.herfindal at bio.ntnu.no>,
	"r-sig-mixed-models at r-project.org"
<r-sig-mixed-models at r-project.org>
Subject: Re: [R-sig-ME] mixed zero-inflated poisson regression
Message-ID:
	<AA818EAD2576BC488B4F623941DA74275556C5A8 at inbomail.inbo.be>
Content-Type: text/plain; charset="us-ascii"

Dear Ivar,

Your assumption on MCMCglmm is incorrect. You can specify different
predictor for both the count and the binomial component. Have a look at
the CourseNotes vignette for an example.

Best regards,

Thierry

------------------------------------------------------------------------
----
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

> -----Oorspronkelijk bericht-----
> Van: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-
> bounces at r-project.org] Namens Ivar Herfindal
> Verzonden: maandag 14 november 2011 9:57
> Aan: r-sig-mixed-models at r-project.org
> Onderwerp: [R-sig-ME] mixed zero-inflated poisson regression
> 
> Dear all
> 
> It seems like I have reached the limit of my statistical skills, and
now turn to the
> mixed-list for some assistance.
> 
> I have a dataset with a response variable that appears to have a
distribution that
> is zero-inflated poisson (ZIP) (or close to). The response variable is
number of
> fledglings from nests with known mothers.
> The number of fledglings has a lot of zeros that comes from failed
breeding
> attempts. The number of fledgings can therefore be considered as a
two-step
> process:
> 1. do the female breed, and
> 2. if the female breed, what was the number of fledglings.
> 
> I want to relate the number of fledglings to some predictor variables,
and I am
> particularly interested in the combined effect of these variables on
the
> probability to breed and number of fledglings given breeding. I have
several
> observations per female (multiple years) and each nest location is
also
> represented by several females (in different years). I therefore
need/want to
> account for random factors. I guess I could run two separate models
(one
> binomial and one poisson), but I need to make predictions on how my
> explanatory variables affect the number of fledglings irrespective of
the
> mechanism behind.
> 
> I have searched R-sites a lot, and it found some packages that can
model ZIP.
> 1. pscl (function zeroinfl)
> Pros: have the option to relate both the poisson and binomial process
to the
> predictor variable, so I can evaluate how probability of breeding and
number of
> fledglings depend on my predictor variables
> Cons: Can not include random factors.
> 
> 2. glmmADMB (function glmmadmb), MCMCglmm, and gamlss
> Pros: allows random factors
> Cons: 1. all packages seems to assume that all zeros have the same
probability
> of belonging to the zero component, i.e. the logit-link structure does
not depend
> on the predictor variable.
> 2. For some of the packages (particularly MCMCglmm) I have some
challenges
> about how to specify the model correctly.
> 
> Does anyopne in the R(mixed) community have experience on this? I
provide a
> simple example below to illustrate what I want.
> 
> Ivar Herfindal
> 
> ## A simple reproducible example
> library(pscl)
> library(glmmADMB)
> library(MCMCglmm)
> library(gamlss)
> set.seed(4242)
> 
> # create some dummy data that have a lot of zeros in the dependent
variable
> dataa <- cbind.data.frame(Y=rbinom(100, 1, 0.7)) dataa$Y <- dataa$Y *
> rpois(100, 3) # the response dataa$A <- rnorm(100) # explanatory
variable
> dataa$gr <- factor(1:10) # grouping variable
> 
> # fit models with different packages
> 
> ## First non-mixed models
> ### pscl zeroinfl
> zimod <- zeroinfl(Y ~ A, data=dataa, dist="poisson") # Get estimate of
A on
> poisson and binomial process separate, this is what I want (I believe
so...)
> 
> zimod2 <- zeroinfl(Y ~ A|1, data=dataa, dist="poisson") # All zero
counts have
> the same probability of belonging to the zero component
> 
> ### glmmADMB
> admod <- glmmadmb(Y ~ A, data=dataa, family="poisson",
zeroInflation=TRUE)
> # Same results as zimod2.
> # Q: Is it possible to obtain similar results as zimod, i.e.
> # separate the effect of A on the poisson and binomial process?
> 
> ### MCMCglmm
> mcmod <- MCMCglmm(Y ~ A, data=dataa, family="zipoisson",
> rcov=~idh(trait):units)
> # more or less same as admod and zimod2, but "plot(mcmod$Sol)" reveals
large
> variation.
> # Can change burn and nitt to get more stable(?)
> 
> mcmod2 <- MCMCglmm(Y ~ A, data=dataa, family="zipoisson",
> rcov=~idh(trait):units, burn=10000, nitt=20000) # Q: Is it possible to
separate the
> effect of A on the poisson and binomial process?
> 
> ### gamlss
> gamod <- gamlss(Y ~ A, data=dataa, family=ZIP) # same as zimod2 and
admod
> 
> ## Some attempts on mixed models with gr as random factor admodm <-
> glmmadmb(Y ~ A, data=dataa, family="poisson", zeroInflation=TRUE,
> random=~(1|gr)) mcmodm <- MCMCglmm(Y ~ A, data=dataa,
> family="zipoisson", rcov=~idh(trait):units, random=~idh(trait):gr)
gamodm <-
> gamlss(Y ~ A + random(gr), data=dataa, family=ZIP) # For glmmadmb and
gamlss
> I think the models are specified ok # For MCMCglmm I am not sure if it
is
> specified correctly # But I still want to be able to separate the
effect of A on the
> two processes.
> 
> --
> --------------------------------------------------
> Ivar Herfindal (PhD, Researcher)
> Centre for Conservation Biology
> Department of Biology
> Norwegian University for Science and Technology
> N-7491 Trondheim, Norway
> 
> email:     Ivar.Herfindal at bio.ntnu.no
> web:       http://www.ntnu.no/employees/ivar.herfindal
> phone:     +47 73596253
> Cellphone: +47 91625333
> fax:       +47 73596090
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



------------------------------

_______________________________________________
R-sig-mixed-models mailing list
R-sig-mixed-models at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


End of R-sig-mixed-models Digest, Vol 59, Issue 17
**************************************************

_____________________________________________________________________
The Joint Nature Conservation Committee (JNCC) is the statutory adviser to Government on UK and international nature conservation, on behalf of the Council for Nature Conservation and the Countryside, the Countryside Council for Wales, Natural England and Scottish Natural Heritage.  Its work contributes to maintaining and enriching biological diversity, conserving geological features and sustaining natural systems.

JNCC SUPPORT CO. Registered in England and Wales, company no. 05380206. Registered office:  Monkstone House, City Road, Peterborough, Cambridgeshire PE1 1JY


This message has been checked for all known viruses by JNCC delivered through the MessageLabs Virus Control Centre.




More information about the R-sig-mixed-models mailing list