[R-sig-hpc] I can do mpi.apply but not foreach with doMPI

Stephen Weston stephen.b.weston at gmail.com
Wed Aug 26 15:35:14 CEST 2015

Hi Seija,

To use doMPI, you shouldn't start the workers in an Rprofile as
described in the Rmpi documentation, since workers started that way
can only be used by functions in the Rmpi package, such as mpi.apply.
In other words, you don't want to see messages like:

  master (rank 0, comm 1) of size 4 is running on: c1
  slave1 (rank 1, comm 1) of size 4 is running on: c1

When using doMPI from an interactive R session, the workers shouldn't
be started until you execute "startMPIcluster()":

 > cl <- startMPIcluster(3)
         3 slaves are spawned successfully. 0 failed.

What MPI implementation are you using?  I believe the only reason to
start the workers in an Rprofile is that your MPI implementation
doesn't have spawn support. Open MPI has good spawn support and is
able to spawn workers from an interactive R session.  MPICH2 has spawn
support as well, but I believe it can only spawn workers if the R
session itself was started via mpirun, so I think that Open MPI is
preferable for use with Rmpi.
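With a spawn-capable MPI such as Open MPI, the whole interactive
workflow looks roughly like the sketch below (the worker count of 3
is just an example; this follows the usage shown in the doMPI
documentation and needs a working MPI installation to run):

```r
# Interactive doMPI usage with a spawn-capable MPI (e.g. Open MPI).
# Start a plain R session -- do NOT pre-start workers in an Rprofile.
library(doMPI)

cl <- startMPIcluster(3)   # spawns 3 workers via MPI's spawn mechanism
registerDoMPI(cl)          # make %dopar% dispatch to the MPI cluster

# Each iteration runs on a worker, in parallel
a <- foreach(i = 1:3) %dopar% system.time(sort(runif(1e7)))

closeCluster(cl)           # shut down the workers
mpi.quit()                 # finalize MPI and exit R
```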

Even if you're using an MPI implementation without spawn support,
doMPI can use workers as long as all of the processes (master and
workers) are started by mpirun. However, you do need spawn support to
start workers from an interactive R session with doMPI.
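For that non-spawn case, the pattern from the doMPI vignette is
roughly the following; the script name and process count here are
placeholders, and the job must be launched entirely by mpirun:

```r
# script.R -- doMPI in non-spawn mode: mpirun starts ALL the processes.
# Launched as, for example:  mpirun -n 4 R --slave -f script.R
library(doMPI)

# Called with no worker count, startMPIcluster() uses the processes
# that mpirun already started: rank 0 becomes the master and the
# remaining ranks become workers.
cl <- startMPIcluster()
registerDoMPI(cl)

results <- foreach(i = 1:3) %dopar% system.time(sort(runif(1e7)))
print(results)

closeCluster(cl)
mpi.quit()
```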


Steve Weston

On Wed, Aug 26, 2015 at 4:12 AM, Seija Sirkiä <seija.sirkia at csc.fi> wrote:
> Hi all,
> I'm trying to learn to do parallel computing with R and foreach on this cluster of ours but clearly I'm doing something wrong and I can't figure out what.
> Briefly, I'm sitting on a Linux cluster, about which the user guide says that the login nodes are based on RHEL 6, while the computing nodes use CentOS 6. Jobs are submitted using SLURM.
> So there I go, requesting a short interactive test session using:
> srun -p test -n4 -t 0:15:00 --pty Rmpi --no-save
> Here Rmpi is the modified R_home_dir/bin/R mentioned in the Rprofile file that comes with Rmpi ("This R profile can be used when a cluster does not allow spawning --- Another way is to modify R_home_dir/bin/R by adding...").
> When my session starts, I get these messages:
> master (rank 0, comm 1) of size 4 is running on: c1
> slave1 (rank 1, comm 1) of size 4 is running on: c1
> slave2 (rank 2, comm 1) of size 4 is running on: c1
> slave3 (rank 3, comm 1) of size 4 is running on: c1
> before the prompt. Sounds good, and if I go check top on the c1 node, there I see 3 R's churning away happily at 100% cpu time, and one not doing much. As it should be, as far as I can tell?
> If I then run this little test:
> funtorun<-function(k) {
>   system.time(sort(runif(1e7)))
> }
> system.time(a<-mpi.apply(1:3,funtorun))
> a
> b<-a
> system.time(for(i in 1:3) b[[i]]<-system.time(sort(runif(1e7))))
> b
> it goes through nicely: the mpi.apply part takes about 2.6 seconds in total, with each of the 3 sorts taking about that same time, while the for-loop takes about 7 seconds in total, each of the three sorts taking about 2.3 seconds. Nice, that tells me the workers will do stuff, simultaneously, when requested correctly.
> But if I try this instead:
> library(doMPI)
> cl<-startMPIcluster()
> registerDoMPI(cl)
> system.time(a<-foreach(i=1:3) %dopar% system.time(sort(runif(1e7))))
> it just hangs at the foreach line and never gets through; the job is only killed at the end of the reserved 15 minutes, or when I scancel it myself. None of the lines give any errors.
> So what am I doing wrong? I have a hunch this has something to do with how my workers are started, since I never get to run the mpirun commands that the doMPI manual speaks of. But despite my efforts at reading the manual and the documentation of startMPIcluster, I haven't figured out what else to try.
> Many thanks in advance for your time!
> BR,
> Seija Sirkiä
> _______________________________________________
> R-sig-hpc mailing list
> R-sig-hpc at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc