[R-sig-ME] A zero inflated Poisson model in MCMCglmm

Jarrod Hadfield j.hadfield at ed.ac.uk
Sun Jul 8 13:03:37 CEST 2012


Hi,

I agree with David that an ordinal model may behave better for this  
type of data (or even a 0/1 response with deformed or not) but in case  
you want to pursue a ZIP model...

The model you have specified is probably not what you had in mind. You  
should think of the ZIP response as being two responses (traits): the  
Poisson counts (trait 1) and the probability that a zero comes from  
the zero-inflation process (trait 2). At the moment you have a common  
intercept for both and therefore assume that the probability that a  
zero is from the zero-inflation process (on the logit scale) is equal  
to the mean Poisson count (on the log scale). There is probably little  
justification for this (but see ZAP models). Moreover, you have a  
single animal term, and therefore assume that both processes have the  
same genetic variance and that the genetic correlation between them is  
1. You also assume that the residual variance for each process is  
equal (but the residual correlation is 0), although this is not  
actually that bigger deal because residual variation in the  
zero-inflation process (and residual corrletaion) is not identifiable  
from the data and so the value it could take is essentially arbitrary.  
However, I like to recognise this non-identifiability by fixing the  
variance at 1 and the covariance at zero (but see ZAP models where  
your specification is useful).

So, a more appropriate model would be:

model1.1<-MCMCglmm(Toes~trait,random=~us(trait):animal,family="zipoisson",rcov=~idh(trait):units,pedigree=ped,data=data,prior=prior1.1,nitt=500000,thin=500,burnin=200000,verbose=FALSE)

or perhaps

model1.2<-MCMCglmm(Toes~trait,random=~idh(trait):animal,family="zipoisson",rcov=~idh(trait):units,pedigree=ped,data=data,prior=prior1.1,nitt=500000,thin=500,burnin=200000,verbose=FALSE)

The difference between these two is that the genetic correlation  
between the zero-inflation and the Poisson processes is estimated, and  
in the latter it is set to 0.

I'm not sure about how much data you have, but I would think that if  
the incidence of deformed toes is very low, you probably have little  
information for some of the parameters. In this case you have to be  
VERY careful with priors. The prior recommendations in the MCMCglmm  
tutorial in Wilson et al. 2011 will generally be quite informative for  
small or even moderate sized datasets (the authors have kindly  
published an erratum as many people seem to be using their  
recommendation without realising its implications).

The prior could look like:

prior=list(R=list(V=diag(2), nu=0, fix=2), G=list(G1=G1))

which fixes the residual variance in the zero-inflation process to 1  
with a flat improper prior on the Poisson over-dispersion. For the  
prior on the genetic (co)variances (G1) I generally use parameter  
expanded priors of the form:

G1=list(V=diag(2), nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*1000)

or

G1=list(V=diag(2), nu=1, alpha.mu=c(0,0), alpha.V=diag(2)*1000)

for model.2.

  These generally give posterior modes for the variance components  
that are close to (RE)ML estimates, although it is unclear to me how  
they behave for the covariances and functions of variances (e.g.  
heritability). At least for binary traits (e.g. the zero-inflation  
bit) I have seen a chi-square prior recommended because of its  prior  
properties on the heritability (de Villemereuil in MEE). This type of  
prior can also be obtained using parameter expansion:

(V=1, nu=1000, alpha.mu=0, alpha.V=1)

and I would guess that the multivariate analogue would be (V=diag(2),  
nu=1001, alpha.mu=c(0,0), alpha.V=diag(2)). However, a chi-square  
prior probably has poor properties for the Poisson variance and in  
MCMCglmm you could only mix these different priors for model.2 using a  
different syntax:

model1.2<-MCMCglmm(Toes~trait,random=~idh(at(trait,1)):animal+idh(at(trait,2)):animal,family="zipoisson",rcov=~idh(trait):units,pedigree=ped,data=data,prior=prior1.1,nitt=500000,thin=500,burnin=200000,verbose=FALSE)

G=list(
     G1=list(V=1, nu=1, alpha.mu=0, alpha.V=1000),
     G2=list(V=1, nu=1000, alpha.mu=0, alpha.V=1)
   )


Cheers,

Jarrod




Quoting Helen Ward <h.l.ward at qmul.ac.uk> on Wed, 04 Jul 2012 16:24:36 +0100:

> Dear list,
>
> I work on bats, some of which have deformed toes. I have just started
> trying to use MCMCglmm to investigate whether there is hertiable
> variation for the number of deformed toes a bat has. Since most bats
> have 0 deformed toes I am using a zero-inflated Poisson model.
>
> I have put together some code (see below) based on the MCMCglmm tutorial
> from the Wilson et al. 2011 'Ecologist's guide to the animal model'
> paper, the MCMCglmm Course Notes and trial and error. This code runs and
> I get a model and a heritability estimate. However, the heritability
> estimate I get is extremely high and this, teamed with the observations
> that my posterior distributions of the variance componants are not
> pretty as the example ones I've seen and my autocorrelation values are
> high (~0.4), makes me suspicious that I am not modelling this trait very
> well. I suspect I have over simplified the model. I based my prior on a
> model done in ASReml modelling 'Toes' as a Poisson distribution, which
> can an heritability estimate of 0.35.
>
> In addition, when I try and add a second random variable - year of birth
> (Born) - into a second model, both plots and auto correlation get worse
> and my heritability estimate for number of deformed toes (Toes) goes
> from about 0.8 to 0.05!
>
> Finally, I suspect you're not, but I would like to know if you are
> allowed to compare DIC values from models with different priors please.
>
> I have pasted my code below. If anyone can offer constructive criticism
> it would be much appreciated.
>
> Many thanks,
> Helen
>
> Model 1: With number of deformed toes (Toes) as the only random variable
>
>
> p.var<-var(data$Toes,na.rm=TRUE)
> prior1.1<-list(G=list(G1=list(V=matrix(p.var*0.35),n=1)),
> R=list(V=matrix(p.var*0.65),n=1))
>
> model1.1<-MCMCglmm(Toes~1,random=~animal,family="zipoisson",rcov=~trait:units,pedigree=ped,data=data,prior=prior1.1,nitt=500000,thin=500,burnin=200000,verbose=FALSE)
> posterior.heritability1.1<-model1.1$VCV[,"animal"]/(model1.1$VCV[,"animal"]+model1.1$VCV[,"trait:units"])
>
> HPDinterval(posterior.heritability1.1,0.95)
> posterior.mode(posterior.heritability1.1)
>
>
> Model 2: With number of toes (Toes) and year of birth (Born) as random
> variables
>
>
> p.var<-var(data$Toes,na.rm=TRUE)
> prior1.2<-list(G=list(G1=list(V=matrix(p.var/3),n=1),G2=list(V=matrix(p.var/3),n=1)),R=list(V=matrix(p.var/3),n=1))
>
> model1.2<-
> MCMCglmm(Toes~1,random=~animal+Born,family="zipoisson",rcov=~trait:units,pedigree=ped,data=data,prior=prior1.2,nitt=1000000,thin=1000,burnin=500000,verbose=FALSE)
> posterior.heritability1.2<-model1.2$VCV[,"animal"]/(model1.2$VCV[,"animal"]+model1.2$VCV[,"Born"]+model1.2$VCV[,"trait:units"])
>
> HPDinterval(posterior.heritability1.2,0.95)
> posterior.mode(posterior.heritability1.2)
>
>
>
>
>
>
>
>
>
> 	[[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



More information about the R-sig-mixed-models mailing list