[R-sig-ME] Combining MCMCglmm estimates

Hans Ekbrand hans at sociologi.cjb.net
Wed Oct 10 12:18:50 CEST 2012


On Mon, Oct 08, 2012 at 10:45:55AM +0100, Paul Johnson wrote:
> Hi Davina,
> 
> I haven't actually merged runs, so the following isn't based on experience. I'm also not aware of the methods for combining SEs from different imputed data sets that you mention. However I have the feeling that you don't need them if you have MCMC output from each data set.
> 
> Leaving aside imputation for the moment...
> 
> Let's say you've run the same model 10 times from the same data set, giving 10 sets of MCMC output, where each output is a sample from the joint posterior distribution of the model parameters. If these are "good" samples from the posterior, then you can combine them and treat them as a single MCMC sample. By "good" I mean they have started from different starting value sets and burned in for long enough to forget these values, they are large enough (e.g. >=1000 independent samples), and they have converged (you can check this visually by plotting the chains over each other - see plot(mcmc.list(...)) in the coda package). 
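For concreteness, combining and checking such runs with coda might look
roughly like this (a minimal, untested sketch; fit1, fit2 and fit3 stand
for hypothetical MCMCglmm fits of the same model, and Sol holds the
fixed-effect samples):

library(coda)

## hypothetical fits of the same model to the same data
fits <- list(fit1, fit2, fit3)

## collect the fixed-effect samples (the Sol component) as an mcmc.list
chains <- mcmc.list(lapply(fits, function(m) m$Sol))

## overlay traces and densities, and run the Gelman-Rubin diagnostic
plot(chains)
gelman.diag(chains)

## if the runs agree, pool them and treat them as one big sample
pooled <- as.mcmc(do.call(rbind, lapply(chains, as.matrix)))
summary(pooled)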

Why are different starting values important? Shouldn't burn-in make the 10 chains independent enough?

The reason I'm asking is that I want to know what you _have to_ do
in order to use chains that originate from running MCMC on clusters
or on multicore processors.

Different starting points are not very hard to provide, but it would
be good to know whether they are important or not.
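
For what it is worth, running several chains on separate cores and then
combining them does not take much code. A minimal, untested sketch for a
multicore Unix-alike, assuming a toy Gaussian model with one random
effect (the formula, the data frame dat and the prior are invented for
illustration; different starting values would go in MCMCglmm's start
argument, whose exact format I have left out):

library(MCMCglmm)
library(parallel)
library(coda)

prior <- list(R = list(V = 1, nu = 0.002),
              G = list(G1 = list(V = 1, nu = 0.002)))

## run one chain; i is the chain index, used here only to vary the seed
## (different starting values could instead be passed via 'start')
run_chain <- function(i) {
  set.seed(1000 + i)
  MCMCglmm(y ~ x, random = ~ group, data = dat, prior = prior,
           nitt = 13000, burnin = 3000, thin = 10, verbose = FALSE)
}

## three chains on three cores
fits <- mclapply(1:3, run_chain, mc.cores = 3)

## check that the chains agree before pooling the variance components
chains <- mcmc.list(lapply(fits, function(m) m$VCV))
plot(chains)
gelman.diag(chains)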

The idea that different starting points are needed would, if I
understand the rationale correctly, imply that the chains are better
at the end than at the beginning. Is that the point?

-- 
Hans Ekbrand (http://sociologi.cjb.net) <hans at sociologi.cjb.net>


