[R] R on a computer cluster

Jay Emerson jayemerson at gmail.com
Sun Feb 17 18:33:16 CET 2008


Gabriele,

In addition to the suggestions from Markus (below), there is
NetWorkSpaces (package nws).  I have used both nws and snow together
with a package I'm developing (bigmemoRy) which allocates matrices to
shared memory (helping avoid the bottleneck Markus alluded to for
processors on the same computer).  Both seem quite easy to use,
essentially only needing one command to initiate the "cluster" and
then one command to do something like apply() in parallel.  It takes a
little planning of your application, but the "painfully obvious"
parallel problem should be painless to implement.
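
As a minimal sketch of that two-command pattern with snow: the socket
cluster type, the worker count of 2, and the toy squaring function are
assumptions chosen for illustration, not part of any particular
application.

```r
library(snow)

# One command to start the "cluster": here 2 worker R processes on the
# local machine, connected over sockets (an assumed setup).
cl <- makeCluster(2, type = "SOCK")

# One command to do something like sapply() in parallel across the workers.
squares <- parSapply(cl, 1:10, function(x) x^2)

# Shut the workers down when finished.
stopCluster(cl)

print(squares)
```

parSapply() splits the input into one chunk per worker, so this pays off
only when each function call does enough work to outweigh the
communication cost.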

Jay




Hi,

The performance you can expect depends strongly on your application.
If you talk about a cluster, you should think of several computers,
not just one computer with several processors.

If you have several computers, you first have to decide on a
communication protocol for parallel computing: MPI, PVM, ...
Then you have to install it on your computers. I think you should use
MPI and one of its implementations: Open MPI, LAM/MPI.
Then there are several R packages for using these communication protocols:
Rmpi, snow, Rpvm, ...
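
To illustrate how the R packages sit on top of the communication layer,
here is a rough sketch using snow, which can start its workers over MPI
when Rmpi is installed. The fallback to local socket workers, the worker
count of 4, and the variable names are assumptions made only so that the
sketch also runs on a machine without MPI.

```r
library(snow)

# Assumption: use MPI as the transport if Rmpi is installed; otherwise
# fall back to plain socket workers on the local machine.
cl_type <- if (requireNamespace("Rmpi", quietly = TRUE)) "MPI" else "SOCK"

# Start 4 workers (an illustrative count).
cl <- makeCluster(4, type = cl_type)

# Ask every worker which machine it is running on.
hosts <- clusterCall(cl, function() Sys.info()[["nodename"]])

stopCluster(cl)

print(unlist(hosts))
```

With an MPI installation spanning several computers, the reported node
names should differ between workers; with the socket fallback they are
all the local machine.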

If you have one computer with several processors, you can do the same
things. But then you have only shared memory (a bottleneck) and there is
not too much improvement in performance. R itself does not yet make use
of multiple processors. There is a first, experimental R package that
uses OpenMP for multithreading: pnmath
(http://www.stat.uiowa.edu/~luke/R/experimental/)

Some useful links:
http://www.stats.uwo.ca/faculty/yu/Rmpi/
http://ace.acadiau.ca/math/ACMMaC/Rmpi/
http://www.open-mpi.org/
http://www.personal.leeds.ac.uk/~bgy1mm/MPITutorial/MPIHome.html

Best regards
Markus

gabriele.accetta at virgilio.it wrote:
> Dear all,
>
> I usually run R on my laptop with Windows XP Professional.
> Now I want to run R on a computer cluster (4 processors) with
> SUSE Linux Enterprise version 10, but I am new to computer clusters.
>
> Should I modify my functions in order to make use of the greater
> performance and availability compared with my laptop?
>
> Is there any R manual on parallel computation on multiple processors?
> Any suggestions for a basic tutorial on this topic?
>
> Thank you.
>
>

-- 
John W. Emerson (Jay)
Assistant Professor of Statistics
Director of Graduate Studies
Department of Statistics
Yale University
http://www.stat.yale.edu/~jay
REvolution Computing, Statistical Consultant


