[R-sig-ME] mcmcglmm and parallel chains

Tue Jul 31 05:37:11 CEST 2012

On Wed, Jun 13, 2012 at 1:16 AM, Ben Bolker <bbolker at gmail.com> wrote:
> Hans Ekbrand <hans at ...> writes:
>> I am learning MCMCglmm in order to use it on a Beowulf cluster.
>>
>> In https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q3/006558.html
>>
>> Jarrod Hadfield writes:
>>
>> "You can merge MCMC chains from multiple runs, although you should make
>> sure you start them from different initial values"
>>
>> Is it sufficient to provide different random seeds for each run,
>> or does this refer to the start parameter of MCMCglmm()?
>>
>>    start: optional list having 4 possible elements: ‘R’ (R-structure)
>>           ‘G’ (G-structure) and ‘liab’ (latent variables or
>>           liabilities) should contain the starting values where ‘G’
>>           itself is also a list with as many elements as random effect
>>           components. The fourth element ‘QUASI’ should be logical: if
>>           ‘TRUE’ starting latent variables are obtained heuristically,
>>           if ‘FALSE’ then they are sampled from a Z-distribution
>>
>
>   It depends a bit on what your computational issues are.  It would
> probably be _better_ to use multiple starting points, but if you are
> sure you have no problem with burn-in then you can start all the chains
> at the same points and rely on the different random-number seeds to allow
> the chains to explore parameter space independently.  (Using multiple
> starting points would also allow you to use the Gelman-Rubin
> diagnostic to assess convergence.)
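
For reference, the Gelman-Rubin diagnostic compares between-chain and
within-chain variance for each parameter. Below is a minimal base-R
sketch of the basic potential scale reduction factor. The function name
psrf() and the toy matrices are my own placeholders (independent normal
draws, not MCMCglmm output). In practice coda::gelman.diag() is the
usual tool, and it applies a further correction that this sketch omits.

```r
# Basic potential scale reduction factor (Gelman-Rubin) for one parameter.
# 'chains' is a matrix: one column per chain, one row per saved iteration.
# psrf() is a hypothetical helper, not part of MCMCglmm or coda.
psrf <- function(chains) {
  n <- nrow(chains)
  W <- mean(apply(chains, 2, var))   # mean within-chain variance
  B <- n * var(colMeans(chains))     # between-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)
}

set.seed(42)
# Toy stand-ins for saved samples: four "chains" of 1000 draws each.
mixed <- matrix(rnorm(4000), ncol = 4)        # all chains agree
stuck <- sweep(mixed, 2, c(0, 0, 3, 3), "+")  # two chains sit elsewhere

psrf(mixed)  # close to 1
psrf(stuck)  # well above 1, signalling disagreement between chains
```

Values near 1 suggest the chains agree; if all chains start from the
same point, a value near 1 is weaker evidence of convergence.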

Is this a reasonable approach to fitting large cross-classified
logistic models?  I am exploring moving a model to MCMCglmm, but it
already runs slowly with more traditional methods such as glmer().  I
have access to a cluster, so it would not be difficult to split the
chains across many cores.

I would like to know what the issues with that approach are, and what
I should profile to see whether it is a reasonable way to improve
performance.

Thanks!

Josh

>    I would do some experiments with MCMCglmm to ensure that you know
> how random seeds work with it (i.e. that you get identical answers
> if and only if random seeds are set the same).  You may also want/need
> to look at some of the comments in the high performance task view about
> random number streams for parallel computation.
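
A minimal base-R sketch of that kind of seed experiment, using the
L'Ecuyer-CMRG generator from the parallel package to give each chain
its own random-number stream. run_chain() is a hypothetical placeholder
that only draws a few normals; the real call to MCMCglmm() would go in
its place, on the assumption (still to be checked) that MCMCglmm draws
from R's own generator.

```r
library(parallel)

RNGkind("L'Ecuyer-CMRG")  # generator designed for multiple streams
set.seed(2012)

n_chains <- 4
seeds <- vector("list", n_chains)
seeds[[1]] <- .Random.seed
for (i in seq_len(n_chains - 1)) {
  # Each call jumps ahead to the start of the next stream.
  seeds[[i + 1]] <- nextRNGStream(seeds[[i]])
}

# Hypothetical stand-in for one chain: restore the stream, then sample.
# Replace rnorm(3) with the actual model fit.
run_chain <- function(seed) {
  assign(".Random.seed", seed, envir = globalenv())
  rnorm(3)
}

run1 <- lapply(seeds, run_chain)
run2 <- lapply(seeds, run_chain)

identical(run1, run2)            # same seeds should reproduce the draws
identical(run1[[1]], run1[[2]])  # different streams should differ
```

With a socket cluster, parallel::clusterSetRNGStream() sets up streams
of this kind on the workers without the manual loop.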
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/