[R-sig-ME] Binomial model variance and repeatability estimates with MCMCglmm

Tue Oct 21 21:44:19 CEST 2014

List,
A colleague recently asked me some questions about issues with 
estimating repeatabilities for binomial/binary data. In trying to help 
out I started playing around with simulating some data and have been 
struggling to get decent effective sample sizes even when parameters are 
being well estimated. I suspect the problem is partly the priors 
(inverse-gamma) but have had issues with other priors too (priors as 
specified in MCMCglmm are still quite unclear to me in general, however).

While quite often repeatability gets estimated very well despite 
effective sample sizes of only a few dozen I also am getting poorly 
estimated repeatabilities (~0) with a population level repeatability of 
<.4 and sample sizes less than n=100 and reps=2, suggesting the 
inverse-gama being informative around zero. Is the poor mixing 
potentially due to using multinomial2 rather than categorical? I don't 
want to go the categorical route because repeatabilities are of 
particular interest.

I'm much less comfortable with generalized rather than standard linear 
models so I wonder if there are any glaring issues with the coding. Any 
thoughts on what might be going on?

Thanks for any help.
Ned

#make function to calculate inverse logit:
inv.log<- function(x){
   exp(x)/(1+exp(x))}

#specify some start conditions and bookkeeping stuff

#specify data distributions:
repeatability=0.3 #Change this for other repeatabilities (sampled 
population's repeatability, not sample's realized latent repeatability)
sigma2.e=6 #arbitrary, set repeatability above
sigma2.a=(repeatability*sigma2.e+repeatability*((pi^2)/3))/(1-repeatability)
B0=0 #could do different fixed effects with some more work...

#(n=number of individuals, i=number of replications per individual)
counter=0; n=150; i=4
ind.data=matrix(0,n*i,3) #make place to store data
tru.lat=matrix(0,n*i,2) #store sampled underlying scores

for(j in 1:n){
   alpha.j=rnorm(1,0,sigma2.a)
   for(rep in 1:i){
     counter=counter+1
     ind.data[counter,1]=j
     eij=rnorm(1,0,sigma2.e)
     pij=inv.log((B0+alpha.j+eij))
     ind.data[counter,3]=rbinom(1,1,pij)
     ind.data[counter,2]=1-ind.data[counter,3]
     tru.lat[counter,1]=alpha.j
     tru.lat[counter,2]=eij
   }}

ind.data=as.data.frame(ind.data)
names(ind.data)=c("Ind", "Fail", "Success")

prior1 = list(R = list(V = 1, nu=0.002),
               G = list(G1 = list(V = 1, nu = 0.002)))
sim.mcmc<-MCMCglmm(cbind(Fail,Success)~1, random= ~Ind,
                    family="multinomial2", prior=prior1,
                    nitt = 260000, thin = 200, burnin = 60000,
                    verbose=FALSE,data=ind.data)

#estimated repeatabilities
posterior.mode(sim.mcmc$VCV[,1]/(sim.mcmc$VCV[,1]+sim.mcmc$VCV[,2]+((pi^2)/3)))
HPDinterval(sim.mcmc$VCV[,1]/(sim.mcmc$VCV[,1]+sim.mcmc$VCV[,2]+((pi^2)/3)))
#effective sample sizes, tells you if model is mixing well
summary(sim.mcmc)[6]$Gcovariances[4]
summary(sim.mcmc)[8]$Rcovariances[4]

#sample repeatability, not sure this is quite right
tru.a2=var(tru.lat[1:(n/i)*i,1])
tru.e2=var(tru.lat[,2])
tru.a2/(tru.a2+tru.e2+((pi^2)/3))

-- 
Ned A. Dochtermann
Assistant Professor / Department of Biological Sciences
NORTH DAKOTA STATE UNIVERSITY
p: 701.231.7353 / f: 701.231.7149 / www.ndsu.edu

https://sites.google.com/site/neddochtermann/
ned.dochtermann at ndsu.edu