[R-sig-ME] MCMCglmm update (2.18)

Tue Mar 18 09:36:21 CET 2014

Hi,

I have updated MCMCglmm to version 2.18 and the source is on CRAN now.  
I would advise users to upgrade: there are a lot of minor changes and  
bug fixes but the main changes are:

BUGS:

DIC with Gaussian/non-Gaussian mixtures

DIC (and the deviance) was incorrectly calculated in multivariate  
models with a mixture of Gaussian and non-Gaussian responses when  
residual correlations existed between the two types of response. I'm  
sorry that this bug has persisted so long - I don't use DIC because I  
think it is not focused at the right level for most scientific  
inference, particularly with non-Gaussian data as here. I vacillate  
between getting rid of it from MCMCglmm (because when I see it  
reported, its nearly always inappropriately) and retaining it (because  
sometimes it might be useful). If people have strong opinions let me  
know.

Gelman's Prior

The function gelman.prior generates a prior correlation matrix for the  
fixed effects had the predictors been transformed according to  
procedures outlined in Gelman et al's "A weakly informative default  
prior distribution for logistic and other regression models" and an  
i.i.d prior placed on the resulting coefficients. The documentation  
for the argument "scale" in gelman.prior says it is `the prior  
variance for regression parameters' (i.e. the prior covariance matrix  
is the prior correlation multiplied by this number). In version 2.17  
this should have read `the prior standard deviation' rather than  
`prior variance'. In version 2.18 it reads the `prior standard  
deviation' and this is now correct.

UPDATES:

Block Diagonal R-structures

Block Diagonal R-structures are now allowed. For example imagine a  
bivariate model where pairs of observations are made on individuals of  
different sex.  There may be a need to fit different 2x2 residual  
covariance matrices for the two sexes.   rcov=~us(trait:sex):units  
would fit a 4x4 covariance matrix, but the between-sex residual  
covariances would be estimated despite not being identifiable (no  
individual can be both sexes).  Now, models of the form  
rcov=~us(trait:at(sex, "M")):units+us(trait:at(sex, "F")):units can be  
fitted that allow the non-identified covariances to be effectively set  
to zero.

Probit Models

For categorical (ordered or not) data, MCMCglmm uses a redundant  
parameteristaion whereby the non-identified `residuals' are retained  
and their variance (usually) fixed in the prior.  For probit models it  
turns out that the redundant parameteristaion is not necessary and the  
MCMC algorithm will mix without it (as opposed to other models). In  
order to avoid confusion there is a new family called "threshold" (an  
old name for a probit model from genetics). It is identical to  
"ordinal" except the residual variance now refers to the variance  of  
the link function (set to one in a standard probit analysis) rather  
than the variance of the non-identified `residuals'.  Assuming R is  
fixed to one in the prior for both the "threshold" and "ordinal"  
models  then if the ordinal model location effects are divided by  
sqrt(2) and the variance components by 2 then they should be  
equivalent to those in the "threshold model". I have provided an  
example at the end of the email.

People may wonder what the point is given they are identical. There  
are three advantages.

1) Mixing is generally better because everything, including the latent  
variables, can be Gibbs sampled.
2) DIC can be focused on the linear predictor (i.e. Xb+Zu) rather than  
the latent variable (i.e. Xb+Zu+e) as with Gaussian responses (but  
unlike other non-Gaussian responses).
3) Residual correlations are less constrained in multivariate models.  
Imagine two binary response variables analysed in an ordinal model  
with rcov=~corg(trait):units, and have r as the estimated residual  
correlation. The estimated residual latent-scale correlation is   
r/sqrt(1+1)*sqrt(1+1) = r/2 and so the residual latent-scale  
correlation is constrained to lie between -0.5 and 0.5. Having  
family="threshold" the residual latent-scale correlation is r. Please  
note that if |r| is large then numerical problems can occur (it is  
equivalent to the extreme category problems sometimes seen in  
univariate models). This should be evident because the posterior for r  
will get trapped close to 1 or -1.

Cheers,

Jarrod

# Ordinal versus Threshold
set.seed(1000)
x<-rnorm(300)
id.e<-rnorm(150, 0, 1)
id<-gl(150,2)
y<-rbinom(300, 1, pnorm(0.5-x+id.e[id]))

# 150 individuals measured twice with link-scale repeatability of 0.5.
# intercept is 0.5, and the regression coefficient associated with x is -1

data<-data.frame(y=y, x=x, id=id)

prior<-list(R=list(V=1, fix=1), G=list(G1=list(V=1, nu=0)))

m1<-MCMCglmm(y~x, random=~id, family="threshold", data=data, prior=prior)
m2<-MCMCglmm(y~x, random=~id, family="ordinal", data=data, slice=TRUE,  
prior=prior)

plot(mcmc.list(m2$VCV[,1]/2, m1$VCV[,1]))
plot(mcmc.list(m2$Sol/sqrt(2), m1$Sol))

# note that the change of scale means the prior influence is different  
between the two models and so really V in the prior for G and B (the  
fixed effects) should be divided by two to make them exact.

# Alternatively:

prior.2<-list(R=list(V=2, fix=1), G=list(G1=list(V=1, nu=0)))

m3<-MCMCglmm(y~x, random=~id, family="threshold", data=data, prior=prior.2)

plot(mcmc.list(m2$VCV[,1], m3$VCV[,1]))
plot(mcmc.list(m2$Sol, m3$Sol))

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.