[R-sig-ME] How can I make R using more than 1 core (8 available) on a Ubuntu Rstudio server ?

Sun Jan 21 00:24:15 CET 2018

On Thu, Jan 18, 2018 at 03:36:08PM -0500, Ben Bolker wrote:
>   Explaining a little bit more; unlike a lot of informatics/machine
> learning procedures, the algorithm underlying lme4 is not naturally
> parallelizable. There are components that *could* be done in parallel,
> but it's not simple.
> 
>   If you need faster computation, you could either try Doug's
> MixedModels.jl package for Julia, or the glmmTMB package (on CRAN),
> which may scale better than glmer for problems with large numbers of
> fixed-effect parameters (although my guess is that it's close to a tie
> for the problem specs you quote below, unless your fixed effects are
> factors with several levels).

I'm currently analysing a few huge datasets. In one of them the
outcome was binary (in the others the outcome was count data, so I
used a negative binomial model in glmmTMB), so I tried both glmer and
glmmTMB, and glmmTMB was faster. My model included about 11 fixed
effects without interactions and three random intercept terms.

However, I had problems getting a clean convergence when I tried to fit
the model to the complete dataset, both with glmer and glmmTMB, and
what I did might help Nicolas Bédère too. I think the convergence
problems in my case were related to the fact that the outcome was very
rare: only 11,221 cases had the outcome (death), while 5,674,928
didn't have the outcome (they were alive).

Anyway, I divided the dataset into 8 bins and fitted the same model
to each bin. Since I had a 4-core CPU, 4 bins could be fitted
independently in parallel. Then I took the estimates and applied
Rubin's rules to them to get pooled results.
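
To make the pooling step concrete, here is a minimal sketch of Rubin's
rules for a single coefficient. It is written in Python only to show
the arithmetic; it is not the code used for the analysis. The function
name pool_rubin and the numbers are made up for illustration. The
pooled estimate is the mean of the per-bin estimates, and the total
variance is the mean within-bin variance plus (1 + 1/m) times the
between-bin variance.

```python
import math

def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m separate fits using Rubin's rules.

    estimates  -- the coefficient estimate from each fit
    std_errors -- the matching standard error from each fit
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two fits, with one SE per estimate")
    # Pooled point estimate: plain average over the fits.
    q_bar = sum(estimates) / m
    # Within-fit variance: average of the squared standard errors.
    within = sum(se ** 2 for se in std_errors) / m
    # Between-fit variance: sample variance of the estimates.
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    # Total variance according to Rubin's rules.
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)

# Hypothetical estimates and standard errors from four fits.
est = [0.10, 0.20, 0.30, 0.40]
se = [0.10, 0.10, 0.10, 0.10]
pooled_est, pooled_se = pool_rubin(est, se)
print(pooled_est, pooled_se)
```

The same function would be applied to each fixed-effect coefficient in
turn. Whether this pooling is appropriate when the bins share rows is
a separate question that the sketch does not answer.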

(In my particular case, I left all 11,221 positive cases in each of
the 8 datasets, while each negative case only appeared in one of the 8
datasets.)
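
The split described in the parenthesis can be sketched as follows,
again in Python and only as an illustration. The function name
split_keep_cases, the 0/1 coding of the outcome and the random
assignment of negatives to bins are assumptions; the post does not say
how the negatives were allocated.

```python
import random

def split_keep_cases(outcomes, n_bins=8, seed=1):
    """Return n_bins lists of row indices.

    Every row with outcome 1 appears in all bins; every row with
    outcome 0 appears in exactly one bin (assigned at random, which
    is an assumption of this sketch).
    """
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(outcomes) if y == 1]
    negatives = [i for i, y in enumerate(outcomes) if y == 0]
    rng.shuffle(negatives)
    bins = []
    for b in range(n_bins):
        # Take every n_bins-th shuffled negative, starting at offset b,
        # so the negatives are partitioned across the bins.
        bins.append(positives + negatives[b::n_bins])
    return bins

# Toy outcome vector: 3 positives followed by 21 negatives.
y = [1] * 3 + [0] * 21
bins = split_keep_cases(y, n_bins=8)
for b, rows in enumerate(bins):
    print(b, len(rows))
```

Each list of indices would then select the rows for one model fit.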

I consider what I did a kind of poor man's bootstrapping, but I
would like some feedback on the validity of the results one gets
with this method. If it is valid, then it is one way of
parallelising glmer.

-- 
Hans Ekbrand, Fil Dr
Epost/email: <hans.ekbrand at gu.se>
Telefon/phone: +46-31 786 47 73
Institutionen för sociologi och arbetsvetenskap, Göteborgs universitet
Department of sociology and work science, Gothenburg university