[R] More than doubling performance with snow

Hesen Peng hesen.peng at emory.edu
Tue Nov 25 21:27:34 CET 2008


I see. Thank you very much.

On Mon, Nov 24, 2008 at 10:12 AM, Stefan Evert <stefan.evert at uos.de> wrote:
>
>> I'm sorry, but I don't quite understand what "not running solve() in
>> this process" means. I updated the code, and it does show that the results
>> from clusterApply() are identical to the results from lapply(). Could
>> you please explain more about this?
>
> The point is that parallel processing frameworks like Snow and PVM do not
> execute the operation in your (interactive) R session, but rather start
> separate computing processes that carry out the actual calculation (while
> your R session is just waiting for the results to become available).  These
> separate processes can either run on different computers in a network, or on
> your local machine (in order to make use of multiple CPU cores).
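
A minimal sketch of the point above, not code from this thread: it assumes
the 'parallel' package that has shipped with R since 2.14 (its makeCluster()
and clusterCall() follow the snow functions of the same name) and two local
workers. Each worker reports its own process ID, which differs from that of
the interactive session.

```r
# Sketch only: uses the 'parallel' package bundled with R, whose
# makeCluster()/clusterCall() mirror the snow functions of the same name.
library(parallel)

cl <- makeCluster(2)                  # start two local worker processes

master_pid  <- Sys.getpid()           # process ID of this R session
worker_pids <- unlist(clusterCall(cl, Sys.getpid))  # IDs reported by workers

stopCluster(cl)                       # shut the worker processes down

print(master_pid)
print(worker_pids)
```

The worker IDs are distinct from the master's, so any function passed to
clusterApply() runs in those processes rather than in the session you type in.
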
>
>>>> user  system elapsed
>>>> 0.584   0.144   4.355
>
>>>> user  system elapsed
>>>> 4.777   0.100   4.901
>
>
> If you take a close look at your timing results, you can see that the total
> processing time ("elapsed") is only slightly shorter with parallelisation
> (4.36 s) than without (4.90 s).  You've probably been looking at "user" time,
> i.e. the amount of CPU time your interactive R session consumed.  Since with
> parallel processing, the R session itself doesn't perform the actual
> calculation (as explained above), it is mostly waiting for results to become
> available and "user" time is therefore reduced drastically.  In short, when
> measuring performance improvements from parallelisation, always look at the
> total "elapsed" time.
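
To see the difference between "user" and "elapsed" time directly, something
along these lines may help. It is only a sketch: the matrices are a made-up
workload (not the code from this thread), two local workers are assumed, and
the bundled 'parallel' package stands in for snow.

```r
library(parallel)

set.seed(1)
# Hypothetical workload: four random 200 x 200 matrices to invert.
mats <- replicate(4, matrix(rnorm(200 * 200), 200, 200), simplify = FALSE)

cl <- makeCluster(2)                  # two local worker processes (assumed)

t_serial   <- system.time(res_serial   <- lapply(mats, solve))
t_parallel <- system.time(res_parallel <- clusterApply(cl, mats, solve))

stopCluster(cl)

# "user" counts CPU time of this session only; "elapsed" is wall-clock time.
user    <- c(serial = t_serial[["user.self"]], parallel = t_parallel[["user.self"]])
elapsed <- c(serial = t_serial[["elapsed"]],   parallel = t_parallel[["elapsed"]])
print(rbind(user, elapsed))

# Both approaches should give the same inverses.
same_result <- isTRUE(all.equal(res_serial, res_parallel))
print(same_result)
```

The "user" row will typically look much better for the parallel run than the
"elapsed" row does, because the work is billed to the worker processes.
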
>
> So why isn't parallel processing twice as fast as performing the calculation
> in a single thread? Perhaps the advantage of using both CPU cores was eaten
> up by the communication overhead.  You should also take into account that a
> lot of other processes (terminals, GUI, daemons, etc.) are running on your
> computer at the same time, so even with parallel processing you will not
> have both cores fully available to R.  In my experience, there is little
> benefit in parallelisation as long as you just have two CPU cores on your
> computer (rather than, say, 8 cores).
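
As a rough back-of-the-envelope check using the two "elapsed" figures quoted
above: the sketch below assumes two workers and a deliberately simple model
in which parallel time is serial time divided by the number of workers plus
a fixed overhead. Both the model and the variable names are assumptions made
for illustration.

```r
t_serial   <- 4.901   # elapsed seconds without parallelisation (from above)
t_parallel <- 4.355   # elapsed seconds with parallelisation (from above)
workers    <- 2       # assumption: one worker per CPU core

speedup  <- t_serial / t_parallel     # observed speedup, about 1.13
ideal    <- t_serial / workers        # elapsed time with a perfect split
overhead <- t_parallel - ideal        # time not explained by the perfect split

print(round(c(speedup = speedup, ideal = ideal, overhead = overhead), 3))
```

Under that model, nearly two seconds of the parallel run would be spent on
something other than the calculation itself, which fits the communication
overhead explanation, although it does not prove it.
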
>
> Hope this clarifies things a bit (and is reasonably accurate, since I don't
> have much experience with parallelisation),
> Stefan
>
> [ stefan.evert at uos.de | http://purl.org/stefan.evert ]
>



-- 
彭河森 Hesen Peng
http://hesen.peng.googlepages.com/

