[R-sig-ME] zero-inflated poisson in MCMCglmm: error in if (any(rterms == "animal"))

Fri Mar 11 01:02:30 CET 2011

Dear mailing list,

this message has not been posted yet but in the meantime I already got  
some helpful feedback from Jarrod Hadfield that we would like to share  
(see attached messages)

Quoting Jarrod Hadfield <j.hadfield at ed.ac.uk>:

Dear Adrian,

You have used idh(trait): for all random terms which essentially fixes  
the covariance between the zero-inflation and the Poisson part to  
zero. This is fine, and in fact at the residual-level the covariance  
cannot be estimated anyway and so you might as well set it to zero.

In this instance your prior on each variance is equivalent to V=1 and  
nu=2 which is strong. Generally something along the lines of nu=0.002  
is taken to be weak, or perhaps even weaker would be a parameter  
expanded prior with large scale (as this tends to a uniform prior for  
the standard deviations).

In the earlier post you had fix=2 in the residual. This makes a lot of  
sense because it fixes the second diagonal element of the residual  
covariance matrix to its value in V (i.e. 1). The reason that this  
makes sense is that this residual variance is associated with the  
zero-inflation and like with binary data the residual variation  
(over-dispersion) cannot be estimated. Of course this is a Bayesian  
analysis so all is not lost as long as you put a proper prior on the  
variance which you have (V=1, nu=2). However, in a case where all of  
the information comes from the prior I think it is best just to fix it  
at some value - it saves on computing time, it lets your colleagues  
known you know what you're doing, and in this case it probably limits  
numerical problems as the logit/probit function under/overflows.

You may also want to consider zapoisson models - in many ways they  
have a better interpretation (and they often mix better)

In general choosing a prior based on the mixing of the chain is a bad  
idea. Unless there is strong conflict between prior and data,  
increasing the informativeness of the prior will *usually* improve  
mixing.

Hope this helps,

Jarrod

Quoting Adrian Jaeggi <ajaeggi at anth.ucsb.edu>:

Dear Jarrod,

thanks so much for your response, I really appreciate you taking your  
time to answer these list mails! I eventually did notice the syntax  
error but it was already too late to delete the post..

About your other remark on the prior, I noticed that this can be a bit  
tricky for someone not that familiar with Baeysian statistics.. I am  
trying to understand the prior specifications from your paper and  
course notes but to be honest I'm not 100% sure what would be the  
proper settings in my case. I just played around a bit with the n= and  
fix= arguments as well as us vs. idh and the code that yielded the  
highest eff. samp and best converging traces was the one I posted  
before:

prior <- list(R=list(V=diag(2),n=2), G=list(G1=list(V=diag(2), n=2),  
G2=list(V=diag(2), n=2)))

model<- MCMCglmm(AnnualIncome~trait+trait:Grade-1,
         random=~idh(trait):data.Community+idh(trait):data.Region,
         rcov=~idh(trait):units, prior=prior, pl=TRUE,  
family="zipoisson", data=mydata)

I hope it makes sense to choose the prior settings based on these criteria.

I really like your package as I've been using glmm's for quite a while  
now and yours is the first package where I don't have any doubts about  
the reliability of the estimates!

Thanks again and best regards,
Adrian

Quoting Jarrod Hadfield <j.hadfield at ed.ac.uk>:

Hi Adrian,

rcov~idh(trait):units should read rcov=~idh(trait):units

Note that your priors may be quite informative with nu=2.

Cheers,

Jarrod

On 9 Mar 2011, at 17:18, Adrian Jaeggi wrote:

Dear all,

I'm having some troubles running zero-inflated poisson models in  
MCMCglmm and after reading the MCMCglmm Course Notes and the Hadfield  
2010 J Stat Software paper and discussing with colleagues I still  
couldn't figure out the problem so any help here would be greatly  
appreciated!

I am analyzing data from an indigenous population in Bolivia, and in  
this example I'm testing the influence of a person's highest school  
grade on their annual monetary income. Because access to wage labor is  
very limited and recent in this remote population, most people score 0  
on income which is why I want to use zero-inflated poisson. I also  
want to include sampling location (community and region) as random  
effects.

Following a previous thread on this list  
(https://stat.ethz.ch/pipermail/r-sig-mixed-models/2009q2/002283.html)  
I used the following code:

   priors <- list(R=list(V=diag(2),n=2, fix=2),  
G=list(G1=list(V=diag(2), n=2), G2=list(V=diag(2), n=2)))

   model<- MCMCglmm(AnnualIncome~trait+trait:grade-1,
   random=~idh(trait):data.Community+idh(trait):data.Region,
   rcov~idh(trait):units, prior=priors, pl=TRUE, family="zipoisson",
   data=mydata)

Running this code returns the following error message (roughly  
translated from German):
   error in if (any(rterms == "animal")) { :
    missing value where TRUE/FALSE is necessary

I have no idea what this error message refers to. I understand the  
"animal" comes from the blue tits example in the package but I don't  
see what it refers to here and why this evaluation fails to return  
TRUE or FALSE.

Thanks so much for any help with this!
Best regards,
Adrian

-- 
Adrian Jaeggi, PhD
Postdoctoral Researcher
Human Behavioral Ecology Lab
Department of Anthropology
University of California Santa Barbara
Santa Barbara, CA 93106-3210
Phone: 805-455-8587