[R-sig-hpc] slurm resource use and mpi.spawn.Rslaves

Martin Morgan mtmorgan at fhcrc.org
Thu Jul 23 21:06:22 CEST 2009


Hi --

I'm trying to understand how slurm interacts with Rmpi. One way of using
Rmpi is

  salloc -n 20 \
    mpirun -n 1 \
      R -e "library(Rmpi); mpi.spawn.Rslaves(); mpi.quit()"

I understand this to say 'give me 20 cores, and on one of them launch an
R process that spawns slaves to fill the universe with 20 additional R
processes'.

The question is about slurm accounting -- does slurm think that it has
20 cores that are now 'in use', even though mpirun launches only a
single process? That is, are the spawned R processes sneaking in
unnoticed by slurm?

I'm also wondering how slurm does its accounting with salloc -N 20
mpirun -n 1 ... (20 nodes rather than 20 tasks), both when R launches
mpi.universe.size() slaves [which is 20, despite the multiple cores
available on each node] and when mpi.spawn.Rslaves over-allocates.
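
To compare the two allocations from the inside, before R starts, one could print the environment variables that salloc exports. This is only a minimal sketch: report_alloc is a made-up helper name, and the variable names (SLURM_JOB_ID, SLURM_NTASKS, SLURM_JOB_NUM_NODES, SLURM_JOB_CPUS_PER_NODE) are the ones I'd expect from a reasonably recent slurm; older releases may spell them differently, and SLURM_NTASKS may be absent when -n is not given. It shows what was requested, not what the accounting records once the slaves are spawned.

```shell
#!/bin/sh
# report_alloc: hypothetical helper, not part of slurm or Rmpi.
# Prints the allocation as seen through salloc's exported variables.
# Anything slurm did not set is reported as "unset" rather than guessed.
report_alloc() {
  echo "job=${SLURM_JOB_ID:-unset} tasks=${SLURM_NTASKS:-unset} nodes=${SLURM_JOB_NUM_NODES:-unset} cpus_per_node=${SLURM_JOB_CPUS_PER_NODE:-unset}"
}

report_alloc
```

Saved as, say, report_alloc.sh (an assumed file name), it could be run as salloc -n 20 sh report_alloc.sh and again as salloc -N 20 sh report_alloc.sh to see how the task and node counts differ between the two forms.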

Any help appreciated.


Martin


