[R] %dopar% parallel processing experiment

ivo welch ivo.welch at gmail.com
Sat Jul 2 20:04:48 CEST 2011


thank you, uwe.  this is a little disappointing.  parallel processing
for embarrassingly parallel operations (those needing no inter-process
communication) should be feasible if the worker processes are not
created and released for every task but held open and reused.  is there
a light-weight parallel processing facility that could do this?
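
for instance, would something like multicore::mclapply (which doMC uses
underneath) avoid the per-task setup?  a rough, untested sketch, reusing
minfn and A from the script quoted below:

  library(multicore)
  ## mc.preschedule = TRUE (the default) splits 1:A into one block per
  ## core up front, so workers are forked once and reused rather than
  ## created and released for every single uniroot() call
  discard <- mclapply(1:A,
                      function(i) uniroot(minfn, c(1e-20, 9e20), i),
                      mc.cores = 8)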

regards,

/iaw


2011/7/2 Uwe Ligges <ligges at statistik.tu-dortmund.de>:
>
>
> On 02.07.2011 19:32, ivo welch wrote:
>>
>> dear R experts---
>>
>> I am experimenting with multicore processing, so far with pretty
>> disappointing results.  Here is my simple example:
>>
>> A <- 100000
>> randvalues <- abs(rnorm(A))
>> ## an arbitrary function
>> minfn <- function(x, i) { log(abs(x)) + x^3 + i/A + randvalues[i] }
>>
>> ARGV <- commandArgs(trailingOnly = TRUE)
>>
>> if (ARGV[1] == "do-onecore") {
>>    library(foreach)
>>    discard <- foreach(i = 1:A) %do% uniroot( minfn, c(1e-20, 9e20), i )
>> } else if (ARGV[1] == "do-multicore") {
>>    library(doMC)
>>    registerDoMC()
>>    cat("You have", getDoParWorkers(), "cores\n")
>>    discard <- foreach(i = 1:A) %dopar% uniroot( minfn, c(1e-20, 9e20), i )
>> } else if (ARGV[1] == "plain") {
>>    for (i in 1:A) discard <- uniroot( minfn, c(1e-20, 9e20), i )
>> } else {
>>    cat("sorry, but argument", ARGV[1], "is not plain|do-onecore|do-multicore\n")
>> }
>>
>>
>> on my Mac Pro 3,1 (two quad-core CPUs), running R 2.12.0, which reports 8
>> cores:
>>
>>   "plain" takes about 68 seconds (real and user, measured with the unix
>> time command).
>>   "do-onecore" takes about 300 seconds.
>>   "do-multicore" takes about 210 seconds real (300 seconds user).
>>
>> this seems pretty disappointing.  the extra cores also sit mostly idle.
>> feedback appreciated.
>
>
> Feedback is that a single computation within your foreach loop is so quick
> that the overhead of communicating data and results between processes costs
> more time than the actual evaluation; hence you are faster with a single
> process.
>
> What you should do instead:
>
> Write code that does, e.g., 10000 iterations inside each of 10 outer
> iterations, and put the foreach loop only around the outer 10. Then you will
> probably be much faster (untested). But yours is essentially the example I
> use in teaching to show when not to do parallel processing.
>
> Best,
> Uwe Ligges
>
>
>
>
>
>
>> /iaw
>>
>>
>> ----
>> Ivo Welch (ivo.welch at gmail.com)
>>
>


