[R-sig-ME] Combining MCMCglmm estimates

Mon Oct 8 11:45:55 CEST 2012

Hi Davina,

I haven't actually merged runs, so the following isn't based on experience. I'm also not aware of the methods for combining SEs from different imputed data sets that you mention. However I have the feeling that you don't need them if you have MCMC output from each data set.

Leaving aside imputation for the moment...

Let's say you've run the same model 10 times from the same data set, giving 10 sets of MCMC output, where each output is a sample from the joint posterior distribution of the model parameters. If these are "good" samples from the posterior, then you can combine them and treat them as a single MCMC sample. By "good" I mean they have started from different starting value sets and burned in for long enough to forget these values, they are large enough (e.g. >=1000 independent samples), and they have converged (you can check this visually by plotting the chains over each other - see plot(mcmc.list(...)) in the coda package). This is a neater, more flexible and presumably more accurate way of combining SEs - you then have a direct sample from the posterior of the SEs (and all the other other parameters). (Strictly they don't need to be large and may not need to be independent but I think it's easier to assess their validity if they are.)

The next question is whether it's valid to do this with runs that come from data sets that differ due to imputation uncertainty. My intuitive feeling is that it would be. The imputed predicted data values are effectively parameters with distributions, so merging across 10 samples would be a natural way of importing the uncertainty due to sampling from this distribution into the final MCMC output. You'd ideally want more than 10 samples, especially if the missing values have a large impact, but I guess computational capacity is limiting.

Ideally you'd want to do the imputation and the inference in a single model, but I don't think this is possible in MCMCglmm.

I hope this is helpful, but note I'm not an expert in this stuff - you're likely to get much better-informed replies from this list.

Best wishes,
Paul

-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Davina Hill
Sent: 02 October 2012 09:06
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Combining MCMCglmm estimates

Dear list members,

I am using MCMCglmm to investigate the effects of an animal's body mass, age and group size on its reproductive tactic (a factor with four levels), with individual identity and year in the model as random factors.  Some values of body mass and age were missing, so I used Amelia to impute ten datasets and then ran the following model on each dataset separately:

k <- length(levels(ds1$tactic))
I <- diag(k-1)
J <- matrix(rep(1, (k-1)^2), c(k-1, k-1)) IJ <- (1/4) * (I+J) prior = list(R = list(V = IJ, n=4, fix=1),G = list(G1 = list(V = IJ,n = 4), G2 = list(V = IJ,n=4)))

mousemodel <- MCMCglmm(tactic ~ trait-1+mass+age+groupsize, random = ~idh(trait):id + idh(trait):year, rcov = ~idh(trait):units, family = "categorical", burnin = 2000000, nitt = 3000000, thin = 10, data = ds1, prior = prior, verbose = FALSE)

I would like to combine the ten sets of MCMCglmm estimates to get a single inference.  The most promising solution I've found so far is the mi.inference function in the mix package, which uses Rubin's rule to combine estimates and SEs from a number of datasets.  I don't know if the rule applies to MCMC objects and I was wondering if someone could advise me whether what I've done is appropriate.  I extracted the posterior mean and Time-series SE from summary(mousemodel$Sol) for each dataset and ran:

  mIest<-as.data.frame(mi.inference(est,std.err,confidence=0.95))

I used 'Time-series' SE but I'm unclear what the difference is between this and 'Naïve' SE and couldn't find a definition in the MCMCglmm literature.  I'd also like to obtain a single set of HPD intervals if this is possible.

Any help would be much appreciated.

Kind regards,
Davina Hill
<html><p><font face = "verdana" size = "0.8" color = "navy">This communication is intended for the addressee only. It is confidential. If you have received this communication in error, please notify us immediately and destroy the original message. You may not copy or disseminate this communication without the permission of the University. Only authorized signatories are competent to enter into agreements on behalf of the University and recipients are thus advised that the content of this message may not be legally binding on the University and may contain the personal views and opinions of the author, which are not necessarily the views and opinions of The University of the Witwatersrand, Johannesburg. All agreements between the University and outsiders are subject to South African Law unless the University agrees in writing to the contrary.</font></p></html>

_______________________________________________
R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

The University of Glasgow, charity number SC004401