[R-sig-hpc] snow/snowfall: advantages of MPI over the socket method?

Thu Jan 7 00:22:56 CET 2010

Dear R-HPC list,

I've been using snowfall via the socket method quite a bit over the
last year for many of my simulation studies.  The cluster I work on has
SGE and I've been extracting information from the "qhost" command to
find idle nodes on which to spawn R instances via sfInit().  Yes, I
know this may raise eyebrows among system admins, but I've been
diligent about not hogging the shared resources -- this workflow has
served me well.
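
For readers unfamiliar with this workflow, here is a minimal sketch of
the qhost-then-sfInit() idea.  It is an illustration, not the code
actually used: the helper names (pick_idle_hosts, start_socket_cluster),
the 0.5 load threshold, the sample host names, and the assumed qhost
column layout (HOSTNAME ARCH NCPU LOAD ...) are all assumptions.

```r
# Parse `qhost` output and return hosts whose load is below a threshold.
# Assumes two header lines and the column order HOSTNAME ARCH NCPU LOAD ...
pick_idle_hosts <- function(qhost_lines, max_load = 0.5) {
  body   <- qhost_lines[-(1:2)]                       # drop header and rule line
  fields <- strsplit(trimws(body), "\\s+")
  fields <- Filter(function(f) length(f) >= 4, fields)
  host   <- vapply(fields, `[`, character(1), 1)
  # Hosts reporting "-" for load become NA and are dropped below.
  load   <- suppressWarnings(as.numeric(vapply(fields, `[`, character(1), 4)))
  host[host != "global" & !is.na(load) & load < max_load]
}

# Start one socket worker per idle host (hypothetical wrapper; needs snowfall
# and passwordless ssh to the hosts, so it is defined here but not called).
start_socket_cluster <- function(hosts) {
  snowfall::sfInit(parallel = TRUE, cpus = length(hosts),
                   type = "SOCK", socketHosts = hosts)
}

# Made-up sample in the assumed format, for illustration only.
example <- c(
  "HOSTNAME  ARCH  NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS",
  "----------------------------------------------------------",
  "global - - - - - - -",
  "node01 lx24-amd64 8 0.02 15.7G 1.2G 2.0G 0.0",
  "node02 lx24-amd64 8 7.95 15.7G 9.8G 2.0G 0.1",
  "node03 lx24-amd64 8 - 15.7G - 2.0G -")
idle <- pick_idle_hosts(example)
```

On the cluster itself the input would come from something like
system("qhost", intern = TRUE), followed by start_socket_cluster(idle)
and, when finished, sfStop().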

I've also been working with a system admin at my school to try to get
OpenMPI running with snow/snowfall.  At first, we were going to go
with the LAM/MPI implementation for use with the "sfCluster" Unix
program, but we decided on OpenMPI since it is being actively
developed (and can be made to work with SGE).  I think things are
working.  I am aware that you can run batch programs using this
method, but I prefer to run the interactive session (via mpirun with
RMPISNOW).

Some things I noticed that are different compared to the socket method:
1.  I declare the number of cpus and nodes before my R session is
launched via the mpirun command.  This launches that many R instances
on the different nodes (master + workers/slaves).
2.  If there is an error in the R code (e.g., object not found), the
whole R session terminates.
3.  I use sfInit() to declare the number of cpus to use from the
cluster created in 1.
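
The three points above can be sketched roughly as follows.  This is a
hedged illustration under stated assumptions: the process count in the
mpirun line is made up, the helper names (guarded, run_on_mpi_workers)
are hypothetical, and the tryCatch() wrapper only helps if the
termination in point 2 is caused by an uncaught error reaching top
level in a non-interactive R session, which has not been confirmed
here.

```r
# Shell, before R starts (process count is illustrative):
#   mpirun -np 5 RMPISNOW

# Evaluate an expression; on error, report it and return NULL instead of
# letting the error propagate to top level.
guarded <- function(expr) {
  tryCatch(expr, error = function(e) {
    message("caught: ", conditionMessage(e))
    NULL
  })
}

# Attach to the workers started by mpirun and apply `fun` over `xs`.
# Defined but not called here: it needs snowfall, Rmpi and a running MPI job.
run_on_mpi_workers <- function(n_workers, xs, fun) {
  snowfall::sfInit(parallel = TRUE, cpus = n_workers, type = "MPI")
  on.exit(snowfall::sfStop())
  guarded(snowfall::sfLapply(xs, fun))
}

# An "object not found" error is caught rather than propagated.
res <- guarded(no_such_object + 1)
```

Inside the mpirun-launched session, a call such as
run_on_mpi_workers(4, 1:100, sqrt) would then use four of the
pre-started workers.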

Are there any advantages of the MPI method over the socket method?
The only advantages I see are to be able to use sfCluster with LAM/MPI
to select idle nodes from step 1 or to use mpirun/RMPISNOW with SGE to
manage the shared resources.  Aside from these, my experience is that
the socket method is easier and, given point 2 above, more robust.
Please let me know otherwise.

Thanks.
Vinh