[R-sig-ME] MCMCglmm update (2.18)
Jarrod Hadfield
j.hadfield at ed.ac.uk
Tue Mar 18 09:36:21 CET 2014
Hi,
I have updated MCMCglmm to version 2.18 and the source is on CRAN now.
I would advise users to upgrade: there are a lot of minor changes and
bug fixes but the main changes are:
BUGS:
DIC with Gaussian/non-Gaussian mixtures
DIC (and the deviance) was incorrectly calculated in multivariate
models with a mixture of Gaussian and non-Gaussian responses when
residual correlations existed between the two types of response. I'm
sorry that this bug has persisted so long - I don't use DIC because I
think it is not focused at the right level for most scientific
inference, particularly with non-Gaussian data as here. I vacillate
between getting rid of it from MCMCglmm (because when I see it
reported, its nearly always inappropriately) and retaining it (because
sometimes it might be useful). If people have strong opinions let me
know.
Gelman's Prior
The function gelman.prior generates a prior correlation matrix for the
fixed effects had the predictors been transformed according to
procedures outlined in Gelman et al's "A weakly informative default
prior distribution for logistic and other regression models" and an
i.i.d prior placed on the resulting coefficients. The documentation
for the argument "scale" in gelman.prior says it is `the prior
variance for regression parameters' (i.e. the prior covariance matrix
is the prior correlation multiplied by this number). In version 2.17
this should have read `the prior standard deviation' rather than
`prior variance'. In version 2.18 it reads the `prior standard
deviation' and this is now correct.
UPDATES:
Block Diagonal R-structures
Block Diagonal R-structures are now allowed. For example imagine a
bivariate model where pairs of observations are made on individuals of
different sex. There may be a need to fit different 2x2 residual
covariance matrices for the two sexes. rcov=~us(trait:sex):units
would fit a 4x4 covariance matrix, but the between-sex residual
covariances would be estimated despite not being identifiable (no
individual can be both sexes). Now, models of the form
rcov=~us(trait:at(sex, "M")):units+us(trait:at(sex, "F")):units can be
fitted that allow the non-identified covariances to be effectively set
to zero.
Probit Models
For categorical (ordered or not) data, MCMCglmm uses a redundant
parameteristaion whereby the non-identified `residuals' are retained
and their variance (usually) fixed in the prior. For probit models it
turns out that the redundant parameteristaion is not necessary and the
MCMC algorithm will mix without it (as opposed to other models). In
order to avoid confusion there is a new family called "threshold" (an
old name for a probit model from genetics). It is identical to
"ordinal" except the residual variance now refers to the variance of
the link function (set to one in a standard probit analysis) rather
than the variance of the non-identified `residuals'. Assuming R is
fixed to one in the prior for both the "threshold" and "ordinal"
models then if the ordinal model location effects are divided by
sqrt(2) and the variance components by 2 then they should be
equivalent to those in the "threshold model". I have provided an
example at the end of the email.
People may wonder what the point is given they are identical. There
are three advantages.
1) Mixing is generally better because everything, including the latent
variables, can be Gibbs sampled.
2) DIC can be focused on the linear predictor (i.e. Xb+Zu) rather than
the latent variable (i.e. Xb+Zu+e) as with Gaussian responses (but
unlike other non-Gaussian responses).
3) Residual correlations are less constrained in multivariate models.
Imagine two binary response variables analysed in an ordinal model
with rcov=~corg(trait):units, and have r as the estimated residual
correlation. The estimated residual latent-scale correlation is
r/sqrt(1+1)*sqrt(1+1) = r/2 and so the residual latent-scale
correlation is constrained to lie between -0.5 and 0.5. Having
family="threshold" the residual latent-scale correlation is r. Please
note that if |r| is large then numerical problems can occur (it is
equivalent to the extreme category problems sometimes seen in
univariate models). This should be evident because the posterior for r
will get trapped close to 1 or -1.
Cheers,
Jarrod
# Ordinal versus Threshold
set.seed(1000)
x<-rnorm(300)
id.e<-rnorm(150, 0, 1)
id<-gl(150,2)
y<-rbinom(300, 1, pnorm(0.5-x+id.e[id]))
# 150 individuals measured twice with link-scale repeatability of 0.5.
# intercept is 0.5, and the regression coefficient associated with x is -1
data<-data.frame(y=y, x=x, id=id)
prior<-list(R=list(V=1, fix=1), G=list(G1=list(V=1, nu=0)))
m1<-MCMCglmm(y~x, random=~id, family="threshold", data=data, prior=prior)
m2<-MCMCglmm(y~x, random=~id, family="ordinal", data=data, slice=TRUE,
prior=prior)
plot(mcmc.list(m2$VCV[,1]/2, m1$VCV[,1]))
plot(mcmc.list(m2$Sol/sqrt(2), m1$Sol))
# note that the change of scale means the prior influence is different
between the two models and so really V in the prior for G and B (the
fixed effects) should be divided by two to make them exact.
# Alternatively:
prior.2<-list(R=list(V=2, fix=1), G=list(G1=list(V=1, nu=0)))
m3<-MCMCglmm(y~x, random=~id, family="threshold", data=data, prior=prior.2)
plot(mcmc.list(m2$VCV[,1], m3$VCV[,1]))
plot(mcmc.list(m2$Sol, m3$Sol))
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
More information about the R-sig-mixed-models
mailing list