[R-sig-hpc] Easiest Road to Parallel R?

Dirk Eddelbuettel edd at debian.org
Tue Jul 21 15:18:54 CEST 2009


Tom,

On 21 July 2009 at 08:33, Thomas Hampton wrote:
| We have a substantial beowulf cluster and would like to
| get parallel R going. Our systems administrators attempted,
| without success, to get the R function papply to run
| properly. I passed their comments/questions on to this
| list in a previous message. The way I understand it, the various
| pieces are there and report no errors, but the final result is
| that no parallelism is achieved.
| 
| Is there some more bullet-proof route to parallel R than mpich2, Rmpi
| and papply?
| 
| We are on a beowulf cluster, Red Hat Linux.

There are a few questions here that may profit from separation:

0: Should you use R in parallel?  Yup, so that's a given.

1: What _software level_ is recommended?  If you follow the Schmidberger et
al survey paper (linked from the CRAN Task View on High Performance Computing
and otherwise to be had via Google or 'real soon now' at JSS) then you land
at Rmpi and Snow.

2: Given a stated preference for Rmpi, how do you get it going?  Hao Yu does
an admirable job trying to let the configure script find Open MPI, LAM,
MPICH2, DeinoMPI, ...   I have had good results with Rmpi on Debian and
Ubuntu, but at times had to make changes to the configure script, which Hao
then incorporated.  Rmpi and friends tend to work out of the box on Debian
and Ubuntu, using the binaries provided by the distro.

3: Given that you are on a RH system, maybe you should also seek help on
r-sig-fedora for distro-specific hints.

And as a general rule that we echo often here, test components in 'layers'.
I.e. before attempting to get Rmpi installed, verify that you can actually
send an MPI variant of "hello, world" around, etc.
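
To make the 'layers' idea concrete, here is a minimal sketch of a check
script.  It only verifies that each layer's command is on the PATH, lowest
layer first, and stops at the first gap.  The function names (check_layer,
run_layers) are made up for illustration, and the command names (mpicc,
mpirun, Rscript) are assumptions that may differ with your MPI
implementation and install.

```shell
#!/bin/sh
# Layered sanity check: report each layer in order and stop at the first gap.
# The command names passed in are assumptions; adjust to the local MPI install.

# check_layer CMD: print "ok: CMD" if CMD is on PATH, else "MISSING: CMD".
check_layer() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "MISSING: $1"
        return 1
    fi
}

# run_layers CMD...: check commands lowest layer first, stop at first failure.
run_layers() {
    for cmd in "$@"; do
        check_layer "$cmd" || return 1
    done
}

# Lowest layer first: MPI compiler wrapper, MPI launcher, then R itself.
run_layers mpicc mpirun Rscript || echo "Fix the first missing layer before moving up."
```

Once all three report ok, the next layer up would be an actual run, e.g.
something along the lines of 'mpirun -np 2 hostname' across two nodes (the
exact launcher flags depend on your MPI implementation), and only after that
an attempt at library(Rmpi) from within R.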

Hth, Dirk

-- 
Three out of two people have difficulties with fractions.


