[R-sig-hpc] simple question on R/Rmpi/snow/slurm configuration

Dirk Eddelbuettel edd at debian.org
Mon Jan 5 22:40:44 CET 2009


On 5 January 2009 at 16:04, Whit Armstrong wrote:
| > library(Rmpi)
| [linuxsvr.kls.corp:09097] mca: base: component_find: unable to open
| osc pt2pt: file not found (ignored)
| > library(snow)
| > cl <- getMPIcluster()
| > cl

I don't think that works.  You need to be explicit in the creation of the
cluster.  The best trick I found was to refactor / abstract out what snow
does in its internal scripts. I showed that in the UseR talk (as opposed to
the tutorial) and picked it up again in last month's presentation. It goes as
follows:

-----------------------------------------------------------------------------
#!/usr/bin/env r

suppressMessages(library(Rmpi))
suppressMessages(library(snow))

#mpirank <- mpi.comm.rank(0)    # just FYI
ndsvpid <- Sys.getenv("OMPI_MCA_ns_nds_vpid")
if (ndsvpid == "0") {                   # are we master ?
    #cat("Launching master (OMPI_MCA_ns_nds_vpid=", ndsvpid, " mpi rank=",     mpirank, ")\n")
    makeMPIcluster()
} else {                                # or are we a slave ?
    #cat("Launching slave with (OMPI_MCA_ns_nds_vpid=", ndsvpid, " mpi rank=", mpirank, ")\n")
    sink(file="/dev/null")
    slaveLoop(makeMPImaster())
    q()
}

## a trivial main body, but note how getMPIcluster() learns from the
## launched cluster how many nodes are available
cl <- getMPIcluster()
clusterEvalQ(cl, options("digits.secs"=3))      ## use millisecond granularity
res <- clusterCall(cl, function() paste(Sys.info()["nodename"], format(Sys.time())))
print(do.call(rbind,res))
stopCluster(cl)
-----------------------------------------------------------------------------

which you can launch via salloc, as Martin suggested, to create a slurm
allocation. Then have orterun call your script inside that allocation. I
tend to wrap things into a littler script, i.e. something like

      $ salloc -w host[1-32] orterun -n 8 nameOfTheScriptAbove.r

where you should then see 7 hosts (as one acts as the dispatching controller,
so you get N-1 working out of N assigned by orterun).

This has the advantage of never hard-coding how many nodes you use. It is all
driven from the command line.  If you always have the same fixed nodes, then
it is easier to just use the default snow cluster creation with hard-wired
nodes.
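
For that fixed-node case, a minimal sketch using snow's socket clusters might
look like the following. The host list is an assumption for illustration: it
uses "localhost" twice so the snippet runs on a single machine, and you would
substitute your own node names (which need R and snow installed and must be
reachable from the master).

```r
suppressMessages(library(snow))

## Hard-wired worker list. ASSUMPTION: "localhost" twice is only a
## placeholder so the sketch runs on one machine; use your real node names.
hosts <- c("localhost", "localhost")

cl <- makeSOCKcluster(hosts)            # one worker process per entry
res <- clusterCall(cl, function() paste(Sys.info()["nodename"], format(Sys.time())))
print(do.call(rbind, res))
stopCluster(cl)
```

Here the number of workers is fixed by the length of hosts rather than by the
-n argument given to orterun.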

Hth,  Dirk


-- 
Three out of two people have difficulties with fractions.


