[R-sig-ME] MCMCglmm predictions with new data, marginalized with respect to random effects
Alberto Gallano
alberto.gc8 at gmail.com
Thu May 28 03:29:41 CEST 2015
I'm trying to calculate predictions for new data (with confidence
intervals) that are marginalized with respect to the random effects. I've
been able to get part of the way there thanks to the suggestions of Ben
Bolker in this post:
http://stackoverflow.com/questions/21057548/how-can-one-simulate-quantities-of-interest-from-the-posterior-density-in-mcmcgl
I have two questions:
1) how can I marginalize with respect to the random effects when using new
data?
(In other words, how should the information in the "temp" variable below be
incorporated into the "pred_frame" object?. Here, "temp" is a vector of
length 212 (106 random effects for each random term in the model formula -
I have 106 species in the dataset). Pages 46-47 of the course notes suggest
that the variance components should be summed (sigma^2) and then added to
the linear predictor to get the expectation of the outcome variable, but
I'm not sure how to do this with new data. Do I need to add the random
effects to the "pred_frame" object?)
2) how can I calculate confidence intervals (or standard errors) for the
predictions?
(ideally while marginalizing with respect to the random effects, but if
that's not possible, just using the fixed effects).
#--------------------------------------------------------------------
prior1 <- list(
B = list(mu = rep(0, 5), V = diag(1e+08, 5)),
G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 1000),
G2 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 1000)),
R = list(V = 1, nu = 0.002)
)
set.seed(1234)
fit1 <- MCMCglmm(
fixed = I1 ~ I2.species.mean + I2.within.species +
body.mass.species.mean + body.mass.within.species,
random = ~ phylo + species,
rcov = ~ units,
data = incisor.dat,
family = "gaussian",
ginverse = list(phylo = inv_phylo_mat$Ainv),
prior = prior1,
pr = TRUE,
pl = TRUE,
nitt = 1.1e+7, thin = 2000, burnin = 5e+5,
verbose = FALSE
)
# prediction data frame
pred_frame <- expand.grid(
I2.species.mean = seq(5, 12, by = 0.05),
I2.within.species = mean(incisor.dat$I2.within.species),
body.mass.species.mean = seq(1000, 5000, by = 100),
body.mass.within.species = mean(incisor.dat$body.mass.within.species)
)
# fixed effects design matrix
X <- model.matrix(~ I2.species.mean + I2.within.species +
body.mass.species.mean + body.mass.within.species,
data = pred_frame)
# multiply the fixed effects design matrix (X) by the chains of
coefficients
# stored in the Sol ("solution") component of the MCMCglmm object
pred_frame$I1 <- X %*% t(fit1$Sol[, colnames(X)])
# random effects design matrix (with intercept suppressed)
Z <- model.matrix(~ phylo + taxon - 1, data = incisor.dat)
# multiply the random effects design matrix (Z) by the $Liab component
(posterior distributions of latent variables)
# (note: set pl = TRUE when running the model to get it to save the latent
effects)
temp <- colMeans(fit1$Liab %*% Z)
sigma2 <- sum(temp)
#--------------------------------------------------------------------
best,
Alberto
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list