[R-sig-ME] How can I make R use more than 1 core (8 available) on an Ubuntu RStudio server?

Ben Bolker bbolker at gmail.com
Thu Jan 18 21:36:08 CET 2018


  To explain a little more: unlike many informatics/machine
learning procedures, the algorithm underlying lme4 is not naturally
parallelizable. There are components that *could* be done in parallel,
but it's not simple.

  If you need faster computation, you could either try Doug's
MixedModels.jl package for Julia, or the glmmTMB package (on CRAN),
which may scale better than glmer for problems with large numbers of
fixed-effect parameters (although my guess is that it's close to a tie
for the problem specs you quote below, unless your fixed effects are
factors with several levels).
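
A minimal sketch of the glmmTMB route, assuming the glmmTMB package is
installed from CRAN. The original model and data are not shown in this
thread, so the small cbpp data set shipped with lme4 is used as a
stand-in. glmmTMB accepts the same formula syntax as glmer, so an
existing call should carry over with few changes.

```r
library(lme4)     # glmer() and the cbpp example data
library(glmmTMB)  # assumed to be installed from CRAN

# Stand-in model: the poster's actual formula and data are not given.
form <- cbind(incidence, size - incidence) ~ period + (1 | herd)

t_glmer <- system.time(
  fit_glmer <- glmer(form, data = cbpp, family = binomial)
)
t_tmb <- system.time(
  fit_tmb <- glmmTMB(form, data = cbpp, family = binomial)
)

# Compare elapsed times and check that the fixed effects agree closely.
rbind(glmer = t_glmer, glmmTMB = t_tmb)[, "elapsed"]
cbind(glmer = fixef(fit_glmer), glmmTMB = fixef(fit_tmb)$cond)
```

Timings on a data set this small say little about a 250,000-row
problem; the point is only that both fits can be run and compared on a
subset of the real data before committing to one package.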

  Sometimes installing better-optimized linear algebra libraries or
better-optimized builds of R can help (optimized BLAS or Microsoft's
"R Open"), although likely not in the case of lme4.
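
As a rough way to see which linear algebra libraries an R session is
actually using, and whether swapping them changes anything, something
like the following can be run before and after the change. It uses
only base R (version 3.4.0 or later); the matrix size is an arbitrary
choice for illustration.

```r
# Which BLAS/LAPACK libraries is this R session linked against?
extSoftVersion()[["BLAS"]]  # path to the BLAS library (may be empty)
La_library()                # path to the LAPACK library
La_version()                # LAPACK version string

# Rough dense linear algebra timing; rerun after switching BLAS.
set.seed(1)
m <- matrix(rnorm(1000 * 1000), nrow = 1000)
system.time(crossprod(m))
```

A large speed-up here does not imply a similar speed-up for lmer or
glmer, so it is worth timing the actual model fit as well.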

  My other comment is that a lot of the computational load of modeling
has to do with running lots of different models, not with how long a
single model takes.  For example,

 - likelihood profiling
 - parametric bootstrapping
 - model comparison and testing via likelihood ratio tests or
information criteria
 - model selection (ugh)

are all procedures that can be easily parallelized (support for
parallel computation is built-in for the first two).
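
A minimal sketch of the tasks listed above, again using the cbpp data
from lme4 as a stand-in for the real model. The `parallel` and `ncpus`
arguments belong to lme4's profile() and bootMer(); the number of
cores, the number of bootstrap replicates and the two candidate
formulas are assumptions chosen for illustration. "multicore" relies
on forking, so it applies to Linux and macOS only; on Windows, or if
forking misbehaves inside RStudio, `parallel = "snow"` with a cluster
from parallel::makeCluster() passed as `cl` is the alternative.

```r
library(lme4)
library(parallel)

n_cores <- 2L  # assumption; the server described in this thread has 8

# Stand-in model on the cbpp data shipped with lme4.
fit <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

# 1. Likelihood profiling: built-in parallel support.
prof <- profile(fit, parallel = "multicore", ncpus = n_cores)
confint(prof)

# 2. Parametric bootstrap of the fixed effects: built-in parallel support.
boot <- bootMer(fit, FUN = fixef, nsim = 100, seed = 101,
                parallel = "multicore", ncpus = n_cores)

# 3. Model comparison: fit each candidate model in its own process.
candidates <- list(
  null   = cbind(incidence, size - incidence) ~ 1 + (1 | herd),
  period = cbind(incidence, size - incidence) ~ period + (1 | herd)
)
fits <- mclapply(candidates, glmer, data = cbpp, family = binomial,
                 mc.cores = n_cores)
sapply(fits, AIC)
```

Each individual fit still runs on a single core; the gain comes from
running several fits at the same time.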

  cheers
    Ben Bolker

On Thu, Jan 18, 2018 at 3:07 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
> The procedure is fairly simple - just rewrite the lme4 package from
> scratch. :-)
>
> On Thu, Jan 18, 2018 at 2:03 PM Nicolas Bédère <n.bedere at gmail.com> wrote:
>
>> I want to run the *glmer* procedure on a “large” dataset (250,000
>> observations). The model includes 5 fixed effects, 2 interaction terms and
>> 3 random effects. It takes more than 15 min to run on my laptop (recent
>> Intel Core i7, RAM = 4 GB). Thus, the IT department of the university I
>> work at set up an RStudio server running Ubuntu. My problem is that 8
>> cores are available on this server, but when I run the *glmer* procedure,
>> only 1 of them is used and it takes more than 1 h to get the results...
>> How can I solve that problem and improve time efficiency? I found on
>> Google that I may have to use the parallel procedure, but (i) I am not
>> familiar at all with those informatics procedures and they look a bit
>> complicated, and (ii) the code I picked works with functions in other
>> packages such as *kmeans{stats}*
>> (https://stackoverflow.com/questions/29998718/how-can-i-make-r-use-more-cpu-and-memory)
>> but with neither *lmer* nor *glmer*.
>>
>>
>>
>> Can you please help with a simple procedure to tackle the problem?
>>
>>
>> Many thanks !
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


