[R-sig-hpc] Why pure computation time in parallel is longer than the serial version?

George Ostrouchov georgeost at gmail.com
Thu Feb 13 18:21:34 CET 2014


Consider using pbdR. It puts PBLAS and ScaLAPACK at your disposal for 
Fortran-speed matrix parallelism without your needing to learn their APIs. 
While built for truly big machines, it will already show a lot of benefit 
on a machine of your size. Start with pbdDEMO to learn the basics. It is 
batch computing (because that's what's done on big machines) driven by 
Rscript, but the speed and simplicity are worth it!
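
A minimal sketch of what that looks like (assuming pbdDMAT and an MPI 
runtime are installed; the file name cp.r and the matrix sizes are just 
placeholders, not from the thread):

```r
# cp.r -- submit in batch, e.g.: mpirun -np 4 Rscript cp.r
library(pbdDMAT, quietly = TRUE)
init.grid()                               # set up the 2-d process grid

# Distributed 10000 x 100 matrix; each MPI rank owns a block of it
dx <- ddmatrix("rnorm", nrow = 10000, ncol = 100)

cp <- dx %*% t(dx)                        # PBLAS does the multiply in parallel
corner <- as.matrix(cp[1:2, 1:2])         # gather a small corner to inspect
comm.print(corner)                        # only rank 0 prints

finalize()                                # shut MPI down cleanly
```

The same `%*%` syntax works because pbdDMAT provides methods for 
distributed matrices, which is why no new API has to be learned.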

Cheers,
George

On 2/13/14 2:32 AM, romunov wrote:
> When doing calculations in parallel, there are also overhead costs. If the
> computation time per core is short, the overhead may exceed the computation
> itself, making the parallel task slower overall.
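>
> A minimal sketch of that effect (assuming the parallel package, which
> absorbed multicore; mc.cores is forced to 1 on Windows, where fork is
> unavailable):

```r
library(parallel)  # successor to multicore; provides mclapply

# Forking workers and collecting their results has a fixed cost. For very
# small tasks that cost dominates, so the parallel run can be slower.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

serial_time <- system.time(lapply(1:1000, sqrt))["elapsed"]
forked_time <- system.time(mclapply(1:1000, sqrt, mc.cores = cores))["elapsed"]

# Each sqrt() call is far too small to amortize the fork overhead,
# so forked_time is often the larger of the two here.
cat("serial:", serial_time, "forked:", forked_time, "\n")
```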
>
> Cheers,
> Roman
>
>
> On Thu, Feb 13, 2014 at 5:26 AM, Xuening Zhu <puddingnnn529 at gmail.com> wrote:
>
>> I am learning about parallel computing in R, and I found this happening in
>> my experiments.
>>
>> Briefly, in the following example, why are most of the user times in t
>> smaller than those in mc_t? My machine has 32 GB of memory and 2 CPUs,
>> with 4 cores and 8 hyper-threads in total. No optimized BLAS or similar
>> performance library is installed.
>>
>> system.time({
>>     t <- lapply(1:4, function(i) {
>>         m <- matrix(1:10^6, ncol = 100)
>>         system.time(m %*% t(m))
>>     })
>> })
>>
>>
>> library(multicore)
>> system.time({
>>     mc_t <- mclapply(1:4, function(i) {
>>         m <- matrix(1:10^6, ncol = 100)
>>         system.time(m %*% t(m))
>>     }, mc.cores = 4)
>> })
>>
>>> t[[1]]
>> user  system elapsed
>>
>> 11.136   0.548  11.703
>>
>> [[2]]
>> user  system elapsed
>>
>> 11.533   0.548  12.098
>>
>> [[3]]
>> user  system elapsed
>>
>> 11.665   0.432  12.115
>>
>> [[4]]
>> user  system elapsed
>>
>> 11.580   0.512  12.115
>>
>>> mc_t[[1]]
>> user  system elapsed
>>
>> 16.677   0.496  17.199
>>
>> [[2]]
>> user  system elapsed
>>
>> 16.741   0.428  17.198
>>
>> [[3]]
>> user  system elapsed
>>
>> 16.653   0.520  17.198
>>
>> [[4]]
>> user  system elapsed
>>
>> 11.056   0.444  11.520
>>
>> As I understand it, mc_t and t both measure pure computation time. The
>> same thing happens with parLapply from the parallel package. The machine
>> has plenty of memory for this computation (it uses only a few percent).
>>
>> I also tried running four copies of a similar script (below) by hand with
>> 'Rscript' at the same time on the same machine and saving the results. The
>> elapsed time for each was about 12 s as well, so I don't think it is
>> contention for the cores.
>>
>> system.time({
>>     t <- lapply(1, function(i) {
>>         m = matrix(1:10^6, ncol = 100)
>>         system.time(m %*% t(m))
>>     })
>> })
>>
>> So what happens during the parallel run? Does mc_t really measure pure
>> computation time? Can someone explain the whole process step by step?
>>
>> Thanks.
>>
>> --
>>
>> Xuening Zhu
>>
>> _______________________________________________
>> R-sig-hpc mailing list
>> R-sig-hpc at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>>
>
>
