[R-sig-ME] MCMCglmm predictions with new data, marginalized with respect to random effects

Lenth, Russell V russell-lenth at uiowa.edu
Sat May 30 23:17:52 CEST 2015


You can streamline the process you describe somewhat using the lsmeans package; For your example, it'd be something like:

# -----------------------------------------
  library(lsmeans)

  fit1.rg <- ref.grid(fit1, 
    at = list(I2.species.mean = seq(5, 12, by = 0.05),
      body.mass.species.mean = seq(1000, 5000, by = 100)),
    data = incisor.dat)

  # frequentist summary -averages and SEs based on sample of fixed effects
  summary(fit1.rg)

  # posterior sample of predictions at each point in the grid
  #   for further analysis using the coda package
  #   (this will be too much, but could be useful for a coarser grid)
  fit1.mcmc <- as.mcmc(fit1.rg)

#--------------------------------------------

Note in the ref.grid call, it was not necessary to specify the value of I2.within.species or body.mass.within.species, because the mean will be used by default. See ?ref.grid, ?"lsmeans-package", and ?models for more information.

This does not address the part about incorporating the latent temp effects. Perhaps some others' answers could provide insights. 
Russ

PS - is the incisor.dat dataset publicly available somewhere?

Russell V. Lenth  -  Professor Emeritus
Department of Statistics and Actuarial Science   
The University of Iowa  -  Iowa City, IA 52242  USA   
Voice (319)335-0712 (Dept. office)  -  FAX (319)335-3017



-----Original Message-----

Date: Wed, 27 May 2015 21:29:41 -0400
From: Alberto Gallano <alberto.gc8 at gmail.com>
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] MCMCglmm predictions with new data, marginalized
	with respect to random effects


I'm trying to calculate predictions for new data (with confidence
intervals) that are marginalized with respect to the random effects. I've
been able to get part of the way there thanks to the suggestions of Ben
Bolker in this post:

http://stackoverflow.com/questions/21057548/how-can-one-simulate-quantities-of-interest-from-the-posterior-density-in-mcmcgl


I have two questions:


1) how can I marginalize with respect to the random effects when using new
data?

(In other words, how should the information in the "temp" variable below be
incorporated into the "pred_frame" object?. Here, "temp" is a vector of
length 212 (106 random effects for each random term in the model formula -
I have 106 species in the dataset). Pages 46-47 of the course notes suggest
that the variance components should be summed (sigma^2) and then added to
the linear predictor to get the expectation of the outcome variable, but
I'm not sure how to do this with new data. Do I need to add the random
effects to the "pred_frame" object?)


2) how can I calculate confidence intervals (or standard errors) for the
predictions?

(ideally while marginalizing with respect to the random effects, but if
that's not possible, just using the fixed effects).



#--------------------------------------------------------------------
prior1 <- list(
    B = list(mu = rep(0, 5), V = diag(1e+08, 5)),
    G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 1000),
             G2 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 1000)),
    R = list(V = 1, nu = 0.002)
)

set.seed(1234)
fit1 <- MCMCglmm(
    fixed = I1 ~ I2.species.mean + I2.within.species +
body.mass.species.mean + body.mass.within.species,
    random = ~ phylo + species,
    rcov = ~ units,
    data = incisor.dat,
    family = "gaussian",
    ginverse = list(phylo = inv_phylo_mat$Ainv),
    prior = prior1,
    pr = TRUE,
    pl = TRUE,
    nitt = 1.1e+7, thin = 2000, burnin = 5e+5,
    verbose = FALSE
    )

# prediction data frame
pred_frame <- expand.grid(
    I2.species.mean = seq(5, 12, by = 0.05),
    I2.within.species = mean(incisor.dat$I2.within.species),
    body.mass.species.mean = seq(1000, 5000, by = 100),
    body.mass.within.species = mean(incisor.dat$body.mass.within.species)
    )

# fixed effects design matrix
X <- model.matrix(~ I2.species.mean + I2.within.species +
                    body.mass.species.mean + body.mass.within.species,
                    data = pred_frame)

# multiply the fixed effects design matrix (X) by the chains of
coefficients
# stored in the Sol ("solution") component of the MCMCglmm object
pred_frame$I1 <- X %*% t(fit1$Sol[, colnames(X)])

# random effects design matrix (with intercept suppressed)
Z <- model.matrix(~ phylo + taxon - 1, data = incisor.dat)

# multiply the random effects design matrix (Z) by the $Liab component
(posterior distributions of latent variables)
# (note: set pl = TRUE when running the model to get it to save the latent
effects)
temp <- colMeans(fit1$Liab %*% Z)
sigma2 <- sum(temp)
#--------------------------------------------------------------------


best,
Alberto



More information about the R-sig-mixed-models mailing list