[R-sig-ME] Question about the predict function in MCMCglmm
Joelle Mbatchou
jmbatchou at uchicago.edu
Tue Nov 1 19:27:03 CET 2016
Hi all,
I am a confused about what the predict function from MCMCglmm actually does to get the "predicted probabilities" in the context of a logistic mixed model.
More precisely, I have a binary response Y and logit(E(Y|u)) = L = Xb + u
where X is a matrix of covariates and u~N(0, sigma_u^2 * V) where V is known.
Since the response is binary, the residual term is fixed at 1. I'd fit this model in R as:
idU <- 1:length(Y)
prior1 <- list(R = list(V = 1, fix=1), G = list(G1 = list(V = 1, nu = 1, alpha.mu=0, alpha.V=5^2)))
mMCMC <- MCMCglmm(Y ~ 1 + Z, random =~idU, family = "categorical", ginverse = list(idU = inverse_V), prior=prior1, pr = TRUE, verbose = F)
Also in mMCMC$Sol and mMCMC$VCV, the samples from the joint posterior distribution of p(b,u,sigma_u^2| y) are stored? And so to approximate the marginal posterior distributions (i.e. p(b|y), p(u|y) and p(sigma_u^2|y)) we could just plot histograms of each quantity separately (for example plot(mMCMC$VCV[,1]) for sigma_u^2)?
So in the context of getting posterior predictions for the probability that Y=1 with the predict function, I am confused about what quantities are given by:
pred1 <- predict(mMCMC, marginal = ~idU, type = "response")
vs.
pred2 <- predict(mMCMC, marginal = NULL, posterior = "mean", type = "response")
More specifically, I don't understand what they represent (and how they are obtained). I thought for pred1 'Xb/sqrt(1+k^2*sigma^2)' was computed for each MCMC iteration (eqn 2.14 in CourseNotes) to approximate taking the expectation of E(Y|b, u, sigma^2) over the random effects u and then the average was taken over the MCMC iterations for b and sigma^2; however, when I do this 'by hand', I get different results. Also for pred2, I thought it was giving the predicted probabilities P(Y=1|b,u,sigma^2) so
> beta_hat <- as.matrix(mMCMC$Sol[,1:2])
> X <- as.matrix(mMCMC$X)
> k <- ((16*sqrt(3))/(15*pi))^2
> pred1_hand <- colMeans(plogis(tcrossprod(beta_hat, X) / sqrt(1 + k^2 * rowSums(mMCMC$VCV))))
> U <- as.matrix(mMCMC$Sol[,-(1:2)])
> pred2_hand <- colMeans(plogis(tcrossprod(beta_hat, X) + U))
> head(cbind(pred1, pred2, pred1_hand, pred2_hand)) pred1_hand pred2_hand
1 0.08467802 0.21552374 0.03564130 0.1448464
2 0.23766175 0.17228113 0.15399697 0.2798503
3 0.19338132 0.07845114 0.11215077 0.4200217
4 0.15685811 0.10811526 0.08232281 0.1037176
5 0.14048409 0.35230597 0.07032301 0.2009921
6 0.17927898 0.09011039 0.10013013 0.1568070
Any clarification on this would be greatly appreciated. Thank you!
Cheers,
Joelle
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list