[R-sig-ME] Fwd: Multi Processor / lme4

Sun Apr 27 20:00:48 CEST 2014

  [This is a perfectly reasonable question for r-sig-mixed-models, so
I'm forwarding it there]

===================
This might be very naive.

I assume a very costly part of estimating parameters is evaluating the
likelihood -- particularly when the data are large. It seems like it'd
be fairly easy to distribute that evaluation across several processes
(i.e., multithread it) to speed up the procedure (e.g., with mclapply or
pvec in R's "parallel" package).
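
For concreteness, the kind of split being described might look like the
sketch below. It assumes independent observations, so the log-likelihood
is a plain sum over data points that can be cut into chunks. The toy
normal data and the names chunk_loglik and par_loglik are made up for
illustration; this is not how (g)lmer evaluates its likelihood.

```r
library(parallel)

# Hypothetical toy data: independent draws from a normal distribution.
set.seed(1)
y <- rnorm(1e5, mean = 2, sd = 3)

# "Map" step: log-likelihood contribution of one chunk of observations.
chunk_loglik <- function(idx, mu, sigma) {
  sum(dnorm(y[idx], mean = mu, sd = sigma, log = TRUE))
}

# mclapply relies on forking, which is unavailable on Windows,
# so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Split the observation indices into one chunk per core.
chunks <- splitIndices(length(y), n_cores)

# Evaluate the chunks in parallel, then add the pieces ("reduce" step).
par_loglik <- function(mu, sigma) {
  pieces <- mclapply(chunks, chunk_loglik, mu = mu, sigma = sigma,
                     mc.cores = n_cores)
  Reduce(`+`, pieces)
}

# The chunked sum should agree with the serial one up to rounding.
serial <- sum(dnorm(y, mean = 2, sd = 3, log = TRUE))
parallel_result <- par_loglik(2, 3)
all.equal(serial, parallel_result)
```

This only works because each observation contributes its own additive
term. With random effects the observations within a group are
correlated, so the likelihood does not break up this cleanly.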

That said, I'd guess likelihood evaluation in (g)lmer actually happens
somewhere else where I am not so comfortable.

Any obvious reasons this map-reduce (I think it's called) sort of
technique is not in use?

Thanks for your time. -Nate

--
Nathan Doogan, Ph.D.
Post Doctoral Researcher
The Colleges of Social Work and Public Health
The Ohio State University

  It is indeed a little naive, but not silly at all.  The problem is
that it is *not* "fairly easy" to distribute the evaluation across
processors via map-reduce/mclapply etc.  Doug Bates has already done
all kinds of wizardry to reduce the likelihood evaluation to linear
algebra operations that can be done very efficiently, which speeds the
process up enormously.  He has been doing some careful profiling with
his Julia code, and the largest single cost (if I am recalling things
correctly) is computing a sparse Cholesky decomposition.  He has been looking
*very* recently at the PaStiX library
<http://pastix.gforge.inria.fr/> as a possible way to parallelize
this operation, but it is not completely trivial.
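
To give a feel for the operation in question, here is a rough sketch of
a sparse Cholesky factorization using the Matrix package, which lme4
builds on. The two crossed grouping factors f1 and f2 and the
indicator matrix Zt are hypothetical stand-ins. The factored matrix,
Zt Zt' + I, is only meant to resemble the kind of system that shows up
in a mixed model, not to reproduce lme4's internals.

```r
library(Matrix)

set.seed(1)
n_obs <- 1e5

# Two hypothetical crossed grouping factors.
f1 <- factor(sample(200, n_obs, replace = TRUE))
f2 <- factor(sample(300, n_obs, replace = TRUE))

# Stack the transposed indicator matrices (levels x observations).
Zt <- rbind(fac2sparse(f1), fac2sparse(f2))

# Sparse symmetric cross-product; Imult = 1 below makes Cholesky()
# factor S + I, which is positive definite.
S <- forceSymmetric(tcrossprod(Zt))

# Time the factorization, then keep one copy of the factor.
timing <- system.time(L <- Cholesky(S, Imult = 1))
print(timing)

# Sanity check: use the factor to solve (S + I) x = b.
b <- matrix(rnorm(nrow(S)), ncol = 1)
x <- solve(L, b, system = "A")
A <- S + Diagonal(nrow(S))
resid <- max(abs(as.matrix(A %*% x) - b))
resid
```

Unlike a sum over independent chunks, the steps inside a factorization
depend on one another, which is part of why parallelizing it takes a
dedicated library rather than an mclapply call.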
  cheers
    Ben Bolker