[R-sig-ME] MCMCglmm predictions with new data for multi-response model

Thu Jul 9 01:03:10 CEST 2015

Thanks so much for your response. That is encouraging, and very helpful.
It's nice every once in a while to find out things are simpler than you
imagined.

Two quick notes for anyone who finds this discussion while googling about:
• I let MCMCglmm do the melting and formatting of my (new data) design
matrix (which could have been tricky, being multivariate). I generated my
prediction frame in standard 'wide' format, and then used it as an input
for a dummy model identical to my original model, and then pulled out X
from the dummy model.

• I was getting the non-conformable arguments error for a while, when I
tried to multiple t(Sol) by the design matrix. It took me a while to figure
out that I had to re-run my original model making sure pr=FALSE, as
retaining the random effects changes the structure of Sol.

Thanks again, Malcom!

~rafter

Rafter Sass Ferguson, MS
PhD Candidate | Crop Sciences Department
University of Illinois in Urbana-Champaign
liberationecology.org
518 567 7407

On Tue, Jul 7, 2015 at 7:07 AM, Malcolm Fairbrother <
M.Fairbrother at bristol.ac.uk> wrote:

> Hi Rafter,
>
> (a) You're on the right track.
>
> (b) The design matrix is available at m2b$X. If you take a look, you'll
> see how it's formatted. Something like temp <- as.matrix(m2b$X) %*%
> t(m2b$Sol) should be what you want, just replacing "as.matrix(m2b$X)" with
> your new data, in the same format.
>
> (c.1) Because your outcomes are Gaussian, you don't need to do anything
> with the random effects to generate predictions that are marginalized over
> the random effects. That's only an issue for other sorts of outcomes
> (binary, categorical, etc.). So your steps 5 and 6 are unnecessary. Your
> "temp" is already your "pred_mat".
>
> (c.2) I believe you're looking for apply(temp, 1, function(x)
> HPDinterval(as.mcmc(x))).
>
> Cheers,
> Malcolm
>
>
>
> Date: Mon, 6 Jul 2015 16:54:52 -0500
>> From: rafter sass ferguson <liberationecology at gmail.com>
>> To: r-sig-mixed-models at r-project.org
>> Subject: [R-sig-ME] MCMCglmm predictions with new data for
>>         multi-response  model
>>
>>
>> Dear list,
>>
>> I am hoping to generate predictions (with CIs) based on new data for a
>> model with
>> 3 Gaussian response variables and 1 blocking variable.
>>
>> A very similar and recent question hasn't received a substantive reply
>> yet,
>> here:
>> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2015q2/023616.html .
>> Jarrod mentioned on this list some months ago that a version of MCMCglmm
>> that can predict with new data is imminent, but I haven't seen any signs
>> of
>> it -
>> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2015q1/023329.html .
>>
>> I believe I understand the process as discussed here -
>> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2013q1/020065.html
>>
>> http://stackoverflow.com/questions/21057548/how-can-one-simulate-quantities-of-interest-from-the-posterior-density-in-mcmcgl
>> ,
>> http://glmm.wikidot.com/faq
>>
>> Here is my working understanding of the process:
>> (1) Generate new data / prediction frame -> pred_frame
>> (2) Generate fixed effects design matrix -> fix_dm
>> (3) Extract chains of coefficients from Sol -> beta
>> (4) Matrix-multiply  fix_dm * beta -> temp
>> (5) Extract variance-covariance matrix as rowsums of mod$VCV -> V
>> (6) Divide temp by sqrt(V) to get a matrix of predictions [i, j] with i =
>> MCMC sample and j= new data point -> pred_mat
>> (7) Generate CIs for predictions as HPDs of columns of pred_mat
>>
>> Here are my questions:
>> a. Most broadly - am I on the right track, or have I totally misunderstood
>> something critical?
>>
>> b. For (4) above - the structure of model$Sol in the multi-response model
>> is complex and doesn't conform to the design matrix. I figure I need to
>> melt
>> the design matrix but I'm unsure how to ensure I'm doing that in just the
>> same way as MCMCglmm.
>>
>> c. Some of the same unanswered questions discussed here:
>> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2015q2/023616.html  ,
>> - I suspect based on some of the discussions linked to above that the
>> procedure I outlined produces results that are marginalized over random
>> effects, but I'm not at all sure.
>> - What's the most appropriate way to generate CIs for predictions? If I'm
>> not wrong about (7), what's the best way to calculate those HPDs?
>>
>> Here is the model I'm working with:
>>
>> prior2 <- list(R=list(V=diag(3),nu=3), G=list(G1=list(V=diag(3),
>> nu=3.02, alpha.mu=rep(0,3),
>> alpha.V=1000*diag(3)), G2=list(V=diag(3), nu=3.02, alpha.mu=rep(0,3),
>> alpha.V=1000*diag(3) ) ) )
>>
>> m2b <- MCMCglmm(cbind(professional, relational, practice) ~
>>                  -1 + trait +
>>                  trait:income + trait:age + trait:ethnicity +
>> trait:gender + trait:residence + trait:education + trait:HDI +
>> trait:Ineq + trait:Enviro + trait:Enviro:gender + trait:Ineq:gender +
>> trait:Ineq:ethnicity + trait:Ineq:income,
>>                random= ~us(trait):block + us(trait):ID, rcov=
>> ~us(trait):units,
>>                family=rep("gaussian",3),
>>                prior=prior2,
>>                nitt <- 80000, thin <- 25, burnin <- 50000,
>>                data=df_sel,
>>                verbose=TRUE)
>>
>> Thank you immensely much for any help!
>>
>> Warmly,
>>
>> Rafter
>>
>

	[[alternative HTML version deleted]]