[R-sig-hpc] Problems parallelizing glmnet

Max Kuhn mxkuhn at gmail.com
Fri Sep 7 00:55:26 CEST 2012


I can tell you from my testing with caret that there is considerable
speedup using foreach. See Figure 3 of

  http://cran.r-project.org/web/packages/caret/vignettes/caretTrain.pdf

Of course it is model-dependent, but I have yet to see it slow down
computations (though I'm sure it is possible).
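
As a rough sketch of the setup measured there (the data set, model, and
core count below are illustrative placeholders, not the vignette's
benchmark): register a foreach backend, and train() will run its
resampling loop through foreach and pick the backend up automatically.

  library(caret)
  library(doMC)
  registerDoMC(cores = 4)   # any foreach backend works; 4 cores is arbitrary

  data(iris)
  fit <- train(Species ~ ., data = iris,
               method = "rpart",
               trControl = trainControl(method = "cv", number = 10))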

Max

On Thu, Sep 6, 2012 at 5:09 PM, Peter Langfelder
<peter.langfelder at gmail.com> wrote:
> On Thu, Sep 6, 2012 at 1:58 PM, Zachary Mayer <zach.mayer at gmail.com> wrote:
>> In this case, each iteration of the function is very quick:
>>> system.time(summary(lm(y ~ x[,1]))$coefficients[2,4])
>>    user  system elapsed
>>    0.01    0.00    0.02
>>
>> And you are doing 10,000 iterations, so overhead matters a lot.  In the
>> glmnet problem, each iteration of the function is very slow, and you are
>> doing 8 iterations, so overhead doesn't matter at all.
>>
>> Finally, I suspect that using the doMC foreach backend will improve things
>> considerably, but I can't currently test that.
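
An illustrative sketch of the pattern described above: a handful of
expensive glmnet fits dispatched through foreach with the doMC backend.
The alpha grid, data dimensions, and core count here are invented for
illustration, not taken from the original problem.

  library(glmnet)
  library(foreach)
  library(doMC)
  registerDoMC(cores = 4)

  set.seed(1)
  x <- matrix(rnorm(10000 * 200), ncol = 200)
  y <- rnorm(10000)
  alphas <- seq(0, 1, length.out = 8)   # 8 slow fits, so overhead is negligible

  fits <- foreach(a = alphas, .packages = "glmnet") %dopar% {
    cv.glmnet(x, y, alpha = a)
  }
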
>
> FWIW, the foreach construct itself (without any parallel backend) is
> quite slow, and I would not use it to loop over a large number of quick
> calculations.
>
> Peter
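
To make the overhead point concrete, here is an illustrative comparison
(sizes are arbitrary; no timings from this thread are implied): the same
trivial task run through a plain vapply() and through a sequential
foreach loop, where foreach's per-iteration machinery dominates the cost.

  library(foreach)

  f <- function(i) sqrt(i)    # a deliberately cheap per-iteration task

  ## plain apply-style loop
  system.time(vapply(1:10000, f, numeric(1)))

  ## sequential foreach (%do%, no parallel backend): the same work, but
  ## each iteration passes through foreach's iterator/combine machinery
  system.time(foreach(i = 1:10000, .combine = c) %do% f(i))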



-- 

Max


