[R] More than doubling performance with snow

Luke Tierney luke at stat.uiowa.edu
Fri Nov 28 15:32:26 CET 2008


Hi Markus,

I'm happy to participate in this, as I think I said previously.

I won't have time to look carefully at the draft until sometime next
week, but I remain puzzled about the high time listed for case 3 with
snow/Rmpi.  It would be good to understand what is going on there --
the discrepancy between snow/Rmpi and the other snow variants seems
odd.

I'm not sure how meaningful the timing comparisons are overall.  The
differences are mainly overhead due to additional features and to
differences in communication.  The feature-related overhead is not
likely to be important in any real examples.  In my experience, if
communication is an issue in a substantial (i.e. realistic)
computation, then a more sophisticated approach than simple
scatter-compute-gather is needed, and at that point the ability to
express such an approach becomes more important than the performance
per se.
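
To make that concrete, here is a minimal sketch of the simple
scatter-compute-gather pattern in snow (a toy example with two local
socket workers; the data and the summing function are purely
illustrative):

    library(snow)
    cl <- makeCluster(2, type = "SOCK")       # two local worker processes

    x <- runif(1e6)                           # some data to distribute
    chunks <- clusterSplit(cl, x)             # scatter: one chunk per worker
    partial <- clusterApply(cl, chunks, sum)  # compute: sum each chunk remotely
    total <- do.call(sum, partial)            # gather: combine the partial sums

    stopCluster(cl)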

Best,

luke

On Mon, 24 Nov 2008, Markus Schmidberger wrote:

> Hi,
>
> there is a new mailing list for R and HPC: r-sig-hpc at r-project.org
> This is probably a better list for this question. Don't forget that,
> first of all, you have to register: https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>
> In this case the communication overhead is the problem: the data
> matrix is too big!
> Have a look at the function snow.time to visualize your communication
> and calculation times. It is a new function in snow_0.3-4.
> ( http://www.cs.uiowa.edu/~luke/R/cluster/ )
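>
> As an illustration, a minimal sketch of how snow.time can be used
> (assuming a local two-worker socket cluster; the matrix sizes are
> purely illustrative):
>
>     library(snow)
>     cl <- makeCluster(2, type = "SOCK")
>     ms <- replicate(4, matrix(rnorm(500^2), 500), simplify = FALSE)
>     tm <- snow.time(clusterApply(cl, ms, solve))
>     print(tm)   # per-node timings, including send/receive times
>     plot(tm)    # visualize communication vs. calculation time
>     stopCluster(cl)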
>
> Best
> Markus
>
>
>
> Stefan Evert wrote:
>>
>>> I'm sorry, but I don't quite understand what "not running solve() in
>>> this process" means. I updated the code, and it does show that the
>>> results from clusterApply() are identical to the results from
>>> lapply(). Could you please explain this in more detail?
>>
>> The point is that a parallel processing framework like snow or PVM
>> does not execute the operation in your (interactive) R session, but
>> rather starts separate computing processes that carry out the actual
>> calculation (while your R session just waits for the results to
>> become available).  These separate processes can run either on
>> different computers in a network or on your local machine (in order
>> to make use of multiple CPU cores).
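>>
>> As a rough sketch (assuming the snow package with two local socket
>> workers; the example matrices are purely illustrative):
>>
>>     library(snow)
>>     cl <- makeCluster(2, type = "SOCK")  # start two worker processes
>>     ms <- replicate(2, matrix(rnorm(500^2), 500), simplify = FALSE)
>>     # solve() runs in the worker processes; this session just waits
>>     res <- clusterApply(cl, ms, solve)
>>     # (cl is reused below; call stopCluster(cl) when finished)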
>>
>> parallel (clusterApply):
>>>>> user  system elapsed
>>>>> 0.584   0.144   4.355
>>
>> serial (lapply):
>>>>> user  system elapsed
>>>>> 4.777   0.100   4.901
>>
>>
>> If you take a close look at your timing results, you can see that the
>> total processing time ("elapsed") is only slightly shorter with
>> parallelisation (4.35 s) than without (4.9 s).  You've probably been
>> looking at "user" time, i.e. the amount of CPU time your interactive R
>> session consumed.  Since with parallel processing, the R session itself
>> doesn't perform the actual calculation (as explained above), it is
>> mostly waiting for results to become available and "user" time is
>> therefore reduced drastically.  In short, when measuring performance
>> improvements from parallelisation, always look at the total "elapsed" time.
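>>
>> Continuing the sketch above (with cl and ms as defined there), you
>> would compare the "elapsed" entries rather than the "user" entries:
>>
>>     st.serial   <- system.time(lapply(ms, solve))           # in this session
>>     st.parallel <- system.time(clusterApply(cl, ms, solve)) # in the workers
>>     st.serial["elapsed"]     # total wall-clock time, serial
>>     st.parallel["elapsed"]   # total wall-clock time, parallel
>>     # the small "user" time in the parallel case mostly reflects the
>>     # master session sitting idle while the workers do the work
>>     stopCluster(cl)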
>>
>> So why isn't parallel processing twice as fast as performing the
>> calculation in a single thread? Perhaps the advantage of using both CPU
>> cores was eaten up by the communication overhead.  You should also take
>> into account that a lot of other processes (terminals, GUI, daemons,
>> etc.) are running on your computer at the same time, so even with
>> parallel processing you will not have both cores fully available to R.
>> In my experience, there is little benefit in parallelisation as long as
>> you just have two CPU cores on your computer (rather than, say, 8 cores).
>>
>> Hope this clarifies things a bit (and is reasonably accurate, since I
>> don't have much experience with parallelisation),
>> Stefan
>>
>> [ stefan.evert at uos.de | http://purl.org/stefan.evert ]
>>
>
>
> --
> Dipl.-Tech. Math. Markus Schmidberger
>
> Ludwig-Maximilians-Universität München
> IBE - Institut für medizinische Informationsverarbeitung,
> Biometrie und Epidemiologie
> Marchioninistr. 15, D-81377 Muenchen
> URL: http://www.ibe.med.uni-muenchen.de
> Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de
> Tel: +49 (089) 7095 - 4599
>

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

