[R-sig-hpc] openmpi/rmpi/snow: current puzzles, possible improvements [partial solution]

Ross Boylan ross at biostat.ucsf.edu
Mon May 18 19:46:17 CEST 2009


Now that Rmpi is on the remote nodes, I've got things mostly running.
On Fri, 2009-05-15 at 16:07 -0700, Ross Boylan wrote:
> I think there were several things wrong.
> 1) I wasn't exporting R_PROFILE to the remote nodes.
That matters.  Oddly, without it the slaves show
[n5:12779] mca: base: component_find: unable to open osc pt2pt: file not found (ignored)
[1] "Started Slave"
[n5:12779] OOB: Connection to HNP lost
The first line is standard, and the last line appeared when I hit ^C on the head node.

The "started slave" message does not appear when R_PROFILE is exported.
This suggests that R has some ability to tell when it's started under
MPI.  Maybe it loads something from Rmpi (as opposed to Rsnow), but I
don't see the "Started Slave" message in the Rmpi profile either.

At any rate, the different nodes can't talk to each other, probably
because the slaves aren't even running the snow slave loop.
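
For reference, what the profile arranges (paraphrasing from memory; the
slave-side function name below is an assumption, so check the installed
RMPISNOWprofile for the real code) is roughly this: rank 0 builds the
cluster object that getMPIcluster() later returns, while every other rank
drops straight into snow's slave loop and never reaches the user's script.

--------------- rough paraphrase of RMPISNOWprofile ---------------
library(Rmpi)
library(snow)

if (mpi.comm.rank(0) == 0) {
    ## rank 0 acts as the master: build the cluster object from the
    ## already-running MPI processes
    makeMPIcluster()
} else {
    ## every other rank just serves requests; "runMPIslave" is my
    ## recollection of the snow function name, not a verified reference
    runMPIslave()
}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^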

> 2) R CMD BATCH's output file was the same file for all processes, given
> NFS.
This required a double indirection to get around, since parameters on the
mpirun command line were either evaluated on the head node or not at
all.


So:
R_PROFILE=/usr/lib/R/site-library/snow/RMPISNOWprofile; export R_PROFILE
mpirun -np 5 -host n7,n5  ./innersilly

--------------- innersilly --------------------------
#! /bin/sh
/usr/bin/R CMD BATCH silly.R silly-$(hostname)-$$.out
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Simply sending the output to a local drive doesn't work if there is more
than one process on a machine.  It would be preferable to use the MPI rank
in place of $$, but the rank is not reliably exposed in the environment
under OpenMPI 1.2.  This was with R 2.7.1 on Debian Lenny.
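
OpenMPI 1.3 reportedly exposes the rank as OMPI_COMM_WORLD_RANK in each
process's environment (not in 1.2, as noted above), so another, untested,
option is to do the redirection from inside R itself, e.g. near the top of
silly.R, falling back to hostname + PID when the variable isn't set:

--------------- sketch: per-process output from inside R ---------------
## Divert this process's output to its own file so that several R
## processes on one machine (or on an NFS mount) don't clobber each other.
## OMPI_COMM_WORLD_RANK is only set by OpenMPI 1.3 and later.
tag <- Sys.getenv("OMPI_COMM_WORLD_RANK")
if (tag == "")
    tag <- paste(Sys.info()[["nodename"]], Sys.getpid(), sep = "-")
out <- file(paste("silly-", tag, ".out", sep = ""), open = "wt")
sink(out)                     # ordinary output
sink(out, type = "message")   # warnings and errors
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Everything R prints before the sink() calls (the startup banner, mostly)
would still land in whatever file R CMD BATCH was pointed at.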

There was one remaining issue with output redirection, which I'll put in
a separate thread.

Ross

> 3) The remote nodes did not have Rmpi installed!
> 
> 3) is obviously crucial; I'm not sure how significant the other problems
> are.  I diagnosed it by changing the output file to /tmp/foo and running
> only one job on each node.
> 
> Is there a good way to get unique file names per process on the command
> line?  The only way I can think of is to determine the output file
> inside the batch script invoked by mpirun and using an env variable, if
> one is available (i.e., OpenMPI 1.3 or 1.2 in some scenarios)
> 
> My new invocation looks like this:
> R_PROFILE=/usr/lib/R/site-library/snow/RMPISNOWprofile; export R_PROFILE
> mpirun -np 2 -host n5,n7 -x R_PROFILE /usr/bin/R CMD BATCH silly.R
> 
> I think R CMD BATCH will send its output to stdout and mpirun will redirect
> it to the invoking terminal.  Since I can't actually run it because of 3),
> this is speculative.
> 
> Ross
> 
> On Wed, 2009-05-13 at 21:52 -0700, Ross Boylan wrote:
> > After reading through the thread around
> > https://stat.ethz.ch/pipermail/r-sig-hpc/2009-February/000105.html, and
> > looking at some other things, for ideas about running snow on
> > top of Rmpi on Debian Lenny, I decided to try a shell script:
> > ----------------------------------------------------------------
> > R_PROFILE=/usr/lib/R/site-library/snow/RMPISNOWprofile; export R_PROFILE
> > mpirun -np 6 -hostfile hosts R CMD BATCH snowjob.R snowjob.out
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > with this kind of snowjob.R:
> > -------------------------------------------------------------------
> > # This will only execute on the head node
> > cl <- getMPIcluster()
> > print(mpi.comm.rank(0))
> > 
> > quickinfo <- function() {
> >   list(rank=mpi.comm.rank(0), machine=Sys.info()) #system("hostname"))
> > }
> > print(clusterCall(cl, quickinfo))
> > stopCluster(cl)
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > 
> > and hosts file
> > -------------------
> > n7 slots=3
> > n5 slots=0  # changing this to 2 didn't help
> > n4 slots=4
> > ^^^^^^^^^^^^^^^^^^^
> > 
> > I'm on n7.
> > 
> > Two problems.
> > 
> > First, the job shown never terminates. snowjob.out shows the standard R
> > banner, a standard harmless complaint, and then nothing (technically it
> > shows 
> > [n7:14829] OOB: Connection to HNP lost
> > but I assume that appears after I ^C my shell script).
> > 
> > I suspect the problem is that it's having trouble reaching the other
> > nodes.
> > 
> > Second, if I have n7 slots=7 the job completes.  It shows everything on
> > n7.  However, if I use machine=system("hostname") I get back null
> > strings.  system("hostname") works fine interactively.
> > 
> > Perhaps this is some kind of quoting effect when system("hostname") is
> > exported via clusterCall?  Or does system() not work under Rmpi?
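
One possible culprit, for what it's worth: system() returns the command's
exit status unless intern = TRUE, and the hostname itself only goes to the
worker's stdout, so clusterCall() never sees it.  Something along these
lines (untested in this setup, using the cl from getMPIcluster() in
snowjob.R above) should bring the names back, with Sys.info() probably the
more robust choice since it avoids the shell entirely:

--------------- sketch: getting hostnames back from the slaves ---------------
## intern = TRUE makes system() return the command's output instead of
## its exit status; Sys.info() skips the shell altogether.
print(clusterCall(cl, function()
    list(via.system  = system("hostname", intern = TRUE),
         via.sysinfo = Sys.info()[["nodename"]])))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^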
> > 
> > I'm also not sure why I am not running into a third problem: it looks as
> > if each process should be writing to the same file snowjob.out (via NFS
> > mounts).  That doesn't seem to be happening.  Perhaps because the slave
> > R's never make it out of the RMPISNOWprofile code?
> > 
> > If anyone has any thoughts or suggestions, I'd love to hear them.
> > 
> > Ross
> > 
> > P.S. The original problem is that, apparently, makeCluster(n,
> > type="MPI") will not spawn jobs on other nodes; it may not even spawn
> > more than one job at all.  So I'm attempting to bring up snow within an
> > existing MPI session.
> > 
> > I did notice the docs on MPI_COMM_SPAWN
> > http://www.mpi-forum.org/docs/mpi21-report-bw/node202.htm#Node202
> > indicate there is an info argument which could contain system-dependent
> > information.  Presumably this could include a hostname; the standard
> > explicitly leaves this to the implementation.  I couldn't find anything
> > on what the OpenMPI implementation accepts.  I suppose the source would
> > at least indicate what works now.
> > 
> > So, IF OpenMPI supports it, and if the interface is exposed through Rmpi
> > (which does have mpi.info functions that might be able to construct the
> > right arguments), there would be a possibility of handling this strictly
> > within R.
> > 
>
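
If OpenMPI does honour spawn info keys, the Rmpi side might look something
like the sketch below.  This is untested and rests on two assumptions: that
OpenMPI's MPI_Comm_spawn accepts a "host" key, and that mpi.spawn.Rslaves()
passes info object 0 through to the spawn call.

--------------- sketch: placing spawned slaves via MPI_Info ---------------
library(Rmpi)

## Both assumptions flagged above apply here; treat this as a starting
## point for experiment, not a recipe.
mpi.info.create(0)
mpi.info.set(0, "host", "n5,n4")        # where we would like the slaves to land
mpi.spawn.Rslaves(nslaves = 4)
print(mpi.remote.exec(Sys.info()[["nodename"]]))  # see where they actually ran
mpi.close.Rslaves()
mpi.info.free(0)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^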


