[R-sig-ME] A zero inflated Poisson model in MCMCglmm
Jarrod Hadfield
j.hadfield at ed.ac.uk
Sun Jul 8 13:03:37 CEST 2012
Hi,
I agree with David that an ordinal model may behave better for this
type of data (or even a 0/1 response with deformed or not) but in case
you want to pursue a ZIP model...
The model you have specified is probably not what you had in mind. You
should think of the ZIP response as being two responses (traits): the
Poisson counts (trait 1) and the probability that a zero comes from
the zero-inflation process (trait 2). At the moment you have a common
intercept for both and therefore assume that the probability that a
zero is from the zero-inflation process (on the logit scale) is equal
to the mean Poisson count (on the log scale). There is probably little
justification for this (but see ZAP models). Moreover, you have a
single animal term, and therefore assume that both processes have the
same genetic variance and that the genetic correlation between them is
1. You also assume that the residual variance for each process is
equal (but the residual correlation is 0), although this is not
actually that bigger deal because residual variation in the
zero-inflation process (and residual corrletaion) is not identifiable
from the data and so the value it could take is essentially arbitrary.
However, I like to recognise this non-identifiability by fixing the
variance at 1 and the covariance at zero (but see ZAP models where
your specification is useful).
So, a more appropriate model would be:
model1.1<-MCMCglmm(Toes~trait,random=~us(trait):animal,family="zipoisson",rcov=~idh(trait):units,pedigree=ped,data=data,prior=prior1.1,nitt=500000,thin=500,burnin=200000,verbose=FALSE)
or perhaps
model1.2<-MCMCglmm(Toes~trait,random=~idh(trait):animal,family="zipoisson",rcov=~idh(trait):units,pedigree=ped,data=data,prior=prior1.1,nitt=500000,thin=500,burnin=200000,verbose=FALSE)
The difference between these two is that the genetic correlation
between the zero-inflation and the Poisson processes is estimated, and
in the latter it is set to 0.
I'm not sure about how much data you have, but I would think that if
the incidence of deformed toes is very low, you probably have little
information for some of the parameters. In this case you have to be
VERY careful with priors. The prior recommendations in the MCMCglmm
tutorial in Wilson et al. 2011 will generally be quite informative for
small or even moderate sized datasets (the authors have kindly
published an erratum as many people seem to be using their
recommendation without realising its implications).
The prior could look like:
prior=list(R=list(V=diag(2), nu=0, fix=2), G=list(G1=G1))
which fixes the residual variance in the zero-inflation process to 1
with a flat improper prior on the Poisson over-dispersion. For the
prior on the genetic (co)variances (G1) I generally use parameter
expanded priors of the form:
G1=list(V=diag(2), nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*1000)
or
G1=list(V=diag(2), nu=1, alpha.mu=c(0,0), alpha.V=diag(2)*1000)
for model.2.
These generally give posterior modes for the variance components
that are close to (RE)ML estimates, although it is unclear to me how
they behave for the covariances and functions of variances (e.g.
heritability). At least for binary traits (e.g. the zero-inflation
bit) I have seen a chi-square prior recommended because of its prior
properties on the heritability (de Villemereuil in MEE). This type of
prior can also be obtained using parameter expansion:
(V=1, nu=1000, alpha.mu=0, alpha.V=1)
and I would guess that the multivariate analogue would be (V=diag(2),
nu=1001, alpha.mu=c(0,0), alpha.V=diag(2)). However, a chi-square
prior probably has poor properties for the Poisson variance and in
MCMCglmm you could only mix these different priors for model.2 using a
different syntax:
model1.2<-MCMCglmm(Toes~trait,random=~idh(at(trait,1)):animal+idh(at(trait,2)):animal,family="zipoisson",rcov=~idh(trait):units,pedigree=ped,data=data,prior=prior1.1,nitt=500000,thin=500,burnin=200000,verbose=FALSE)
G=list(
G1=list(V=1, nu=1, alpha.mu=0, alpha.V=1000),
G2=list(V=1, nu=1000, alpha.mu=0, alpha.V=1)
)
Cheers,
Jarrod
Quoting Helen Ward <h.l.ward at qmul.ac.uk> on Wed, 04 Jul 2012 16:24:36 +0100:
> Dear list,
>
> I work on bats, some of which have deformed toes. I have just started
> trying to use MCMCglmm to investigate whether there is hertiable
> variation for the number of deformed toes a bat has. Since most bats
> have 0 deformed toes I am using a zero-inflated Poisson model.
>
> I have put together some code (see below) based on the MCMCglmm tutorial
> from the Wilson et al. 2011 'Ecologist's guide to the animal model'
> paper, the MCMCglmm Course Notes and trial and error. This code runs and
> I get a model and a heritability estimate. However, the heritability
> estimate I get is extremely high and this, teamed with the observations
> that my posterior distributions of the variance componants are not
> pretty as the example ones I've seen and my autocorrelation values are
> high (~0.4), makes me suspicious that I am not modelling this trait very
> well. I suspect I have over simplified the model. I based my prior on a
> model done in ASReml modelling 'Toes' as a Poisson distribution, which
> can an heritability estimate of 0.35.
>
> In addition, when I try and add a second random variable - year of birth
> (Born) - into a second model, both plots and auto correlation get worse
> and my heritability estimate for number of deformed toes (Toes) goes
> from about 0.8 to 0.05!
>
> Finally, I suspect you're not, but I would like to know if you are
> allowed to compare DIC values from models with different priors please.
>
> I have pasted my code below. If anyone can offer constructive criticism
> it would be much appreciated.
>
> Many thanks,
> Helen
>
> Model 1: With number of deformed toes (Toes) as the only random variable
>
>
> p.var<-var(data$Toes,na.rm=TRUE)
> prior1.1<-list(G=list(G1=list(V=matrix(p.var*0.35),n=1)),
> R=list(V=matrix(p.var*0.65),n=1))
>
> model1.1<-MCMCglmm(Toes~1,random=~animal,family="zipoisson",rcov=~trait:units,pedigree=ped,data=data,prior=prior1.1,nitt=500000,thin=500,burnin=200000,verbose=FALSE)
> posterior.heritability1.1<-model1.1$VCV[,"animal"]/(model1.1$VCV[,"animal"]+model1.1$VCV[,"trait:units"])
>
> HPDinterval(posterior.heritability1.1,0.95)
> posterior.mode(posterior.heritability1.1)
>
>
> Model 2: With number of toes (Toes) and year of birth (Born) as random
> variables
>
>
> p.var<-var(data$Toes,na.rm=TRUE)
> prior1.2<-list(G=list(G1=list(V=matrix(p.var/3),n=1),G2=list(V=matrix(p.var/3),n=1)),R=list(V=matrix(p.var/3),n=1))
>
> model1.2<-
> MCMCglmm(Toes~1,random=~animal+Born,family="zipoisson",rcov=~trait:units,pedigree=ped,data=data,prior=prior1.2,nitt=1000000,thin=1000,burnin=500000,verbose=FALSE)
> posterior.heritability1.2<-model1.2$VCV[,"animal"]/(model1.2$VCV[,"animal"]+model1.2$VCV[,"Born"]+model1.2$VCV[,"trait:units"])
>
> HPDinterval(posterior.heritability1.2,0.95)
> posterior.mode(posterior.heritability1.2)
>
>
>
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
More information about the R-sig-mixed-models
mailing list