[R-sig-ME] Multi Processor / lme4

Sun Apr 27 20:53:21 CEST 2014

On 14-04-27 02:34 PM, Doogan, Nathan wrote:
> Thanks for the info, Ben.
> 
> It sounds like the best way for me to speed things up at the moment, then,
> is to build R with a threaded linear algebra library.
> 
> -Nate

  Actually, I don't know if that will help; most of the difficult linear
algebra in lme4 is sparse linear algebra, handled through the Matrix
package (wrapping Tim Davis's SuiteSparse library) and the RcppEigen
package.  I'm not sure how much of it really uses the standard BLAS
back-end.

  Someone with time on their hands could do some benchmarking and see
what happens.
http://cran.r-project.org/web/packages/gcbd/vignettes/gcbd.pdf is a
good resource.
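
  A minimal sketch of such a benchmark, assuming R >= 3.4.0 (for
La_library() and the "BLAS" entry of extSoftVersion()) and an installed
lme4. The helper name time_fit and the choice of the sleepstudy example
model are illustrative assumptions, not something from this thread; a
real comparison would use your own model and data, run once under each
BLAS build.

```r
# Which BLAS/LAPACK is this R session linked against? (needs R >= 3.4.0)
blas   <- extSoftVersion()[["BLAS"]]
lapack <- La_library()

# Hypothetical helper: time repeated fits of a small example model.
# sleepstudy is an example data set that ships with lme4.
time_fit <- function(n_rep = 5) {
  stopifnot(requireNamespace("lme4", quietly = TRUE))
  data("sleepstudy", package = "lme4", envir = environment())
  elapsed <- replicate(n_rep, system.time(
    lme4::lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
  )[["elapsed"]])
  c(median = median(elapsed), min = min(elapsed))
}
```

  Running time_fit() under a reference BLAS and again under a threaded
one, and comparing the medians, would show whether the BLAS matters for
that particular model. The toy model here is far too small to be
informative by itself.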

  If you're really interested in performance, and feeling adventurous, I
would definitely recommend Doug Bates's MixedModels package for Julia.

  It is also the case at the moment that lme4 is slower than lme4.0 for
some problems.

  If you have specific performance questions (rather than just "it would
be nice for lme4 to be faster", which I don't disagree with), it would
be good to give the parameters of your problem -- how many observations,
grouping variables, # of levels of grouping variables, LMM vs GLMM,
structure of grouping variables (nested, crossed, partially crossed) ... ?
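
  As a rough illustration of how to collect those numbers, here is a
sketch in base R. The data frame d and its columns school, class and y
are made-up placeholders; substitute your own data and grouping
factors.

```r
# Hypothetical toy data: 'class' is nested within 'school'
d <- data.frame(
  school = factor(rep(c("A", "B"), each = 6)),
  class  = factor(rep(c("a1", "a2", "b1", "b2"), each = 3)),
  y      = rnorm(12)
)

# Number of observations and number of levels per grouping variable
n_obs    <- nrow(d)
n_levels <- sapply(d[c("school", "class")], nlevels)

# 'class' is nested in 'school' if every class occurs in exactly one
# school; otherwise the two factors are (at least partially) crossed
nested <- all(colSums(table(d$school, d$class) > 0) == 1)
```

  Reporting n_obs, n_levels and nested, along with whether the model is
an LMM or a GLMM, covers most of the list above.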

  Ben Bolker

> p.s. a duplicate of this message (sent from a non-member email address) is
> waiting for moderation. feel free to delete.
> 
> 
> On Sun, Apr 27, 2014 at 2:00 PM, Ben Bolker <bbolker at gmail.com> wrote:
> 
>>
>>   [This is a perfectly reasonable question for r-sig-mixed-models, so
>> I'm forwarding it there]
>>
>> ===================
>> This might be very naive.
>>
>> I assume a very costly part of estimating parameters is evaluating the
>> likelihood -- particularly when the data are large. It seems like it'd
>> be fairly easy to distribute that evaluation across several processes
>> (i.e., multithread it) to speed up the procedure (e.g., mclapply or pvec
>> in R "parallel" package)
>>
>> That said, I'd guess likelihood evaluation in (g)lmer actually happens
>> somewhere else where I am not so comfortable.
>>
>> Any obvious reasons this map-reduce (I think it's called) sort of
>> technique is not in use?
>>
>> Thanks for your time. -Nate
>>
>> --
>> Nathan Doogan, Ph.D.
>> Post Doctoral Researcher
>> The Colleges of Social Work and Public Health
>> The Ohio State University
>>
>>
>>   It is indeed a little naive, but not silly at all.  The problem is
>> that it is *not* "fairly easy" to distribute the evaluation across
>> processors via map-reduce/mclapply etc.  Doug Bates has already done
>> all kinds of wizardry to reduce the likelihood evaluation to linear
>> algebra operations that can be done very efficiently, which speeds the
>> process up enormously.  He has been doing some careful profiling with
>> his Julia code, and the largest single cost (if I am recalling things
>> correctly) is computing a sparse Cholesky decomposition.  He has been
>> looking *very* recently at the PaStiX library
>> <http://pastix.gforge.inria.fr/> as a possible way to parallelize
>> this operation, but it is not completely trivial.
>>   cheers
>>     Ben Bolker
>>
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>