[R] snow/Rmpi without MPI.spawn?

Jim Leek leek2 at llnl.gov
Thu Sep 4 17:36:21 CEST 2014


Ah, now it's working.  Thanks.  Now I just need to figure out how to get 
snow doing this...

Jim

On 09/04/2014 05:03 AM, Martin Morgan wrote:
> On 09/03/2014 10:24 PM, Leek, Jim wrote:
>> Thanks for the tips.  I'll take a look around for "for" loops in the
>> morning.
>>
>> I think the example you provided worked for OpenMPI.  (The default on 
>> our machine is MPICH2, but it gave the same error about calling 
>> spawn.)  Anyway, with OpenMPI I got this:
>>
>>> # salloc -n 12 orterun -n 1 R -f spawn.R
>>> library(Rmpi)
>>> ## Recent Rmpi bug -- should be mpi.universe.size()
>>> nWorkers <- mpi.universe.size()
>
> (the '## Recent Rmpi bug' comment should have been removed; it's a
> holdover from when the script was written several years ago)
>
>>> nslaves = 4
>>> mpi.spawn.Rslaves(nslaves)
>
> The argument needs to be named
>
>   mpi.spawn.Rslaves(nslaves=4)
>
> otherwise R matches unnamed arguments by position, and '4' is 
> associated with the 'Rscript' argument.
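>
> As a toy illustration of positional vs. named matching (using a
> made-up function 'f', not part of Rmpi):
>
>   f <- function(Rscript = "slavedaemon.R", nslaves = 1)
>       c(Rscript = Rscript, nslaves = nslaves)
>   f(4)            ## 4 is matched to 'Rscript'
>   f(nslaves = 4)  ## the named argument goes to 'nslaves'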
>
> Martin
>
>> Reported: 2 (out of 2) daemons - 4 (out of 4) procs
>>
>> Then it hung there.  So things spawned anyway, which is progress.
>> I'm just not sure whether that's the expected behavior for parSapply
>> or not.
>>
>> Jim
>>
>> -----Original Message-----
>> From: Martin Morgan [mailto:mtmorgan at fhcrc.org]
>> Sent: Wednesday, September 03, 2014 5:08 PM
>> To: Leek, Jim; r-help at r-project.org
>> Subject: Re: [R] snow/Rmpi without MPI.spawn?
>>
>> On 09/03/2014 03:25 PM, Jim Leek wrote:
>>> I'm a programmer at a high-performance computing center.  I'm not very
>>> familiar with R, but I have used MPI from C, C++, and Python. I have
>>> to run an R code provided by a guy who knows R, but not MPI. So, this
>>> fellow used the R snow library to parallelize his R code
>>> (theoretically, I'm not actually sure what he did.)  I need to get
>>> this code running on our machines.
>>>
>>> However, Rmpi and snow seem to require mpi spawn, which our computing
>>> center doesn't support.  I even tried building Rmpi with MPICH1
>>> instead of 2, because Rmpi has that option, but it still tries to 
>>> use spawn.
>>>
>>> I can launch plenty of processes, but I have to launch them all at
>>> once at the beginning. Is there any way to convince Rmpi to just use
>>> those processes rather than trying to spawn its own?  I haven't found
>>> any documentation on this issue, although I would've thought it 
>>> would be quite common.
>>
>> This script
>>
>> spawn.R
>> =======
>> # salloc -n 12 orterun -n 1 R -f spawn.R
>> library(Rmpi)
>> ## Recent Rmpi bug -- should be mpi.universe.size()
>> nWorkers <- mpi.universe.size()
>> mpi.spawn.Rslaves(nslaves=nWorkers)
>> mpiRank <- function(i)
>>     c(i=i, rank=mpi.comm.rank())
>> mpi.parSapply(seq_len(2*nWorkers), mpiRank)
>> mpi.close.Rslaves()
>> mpi.quit()
>>
>> can be run like the comment suggests
>>
>>      salloc -n 12 orterun -n 1 R -f spawn.R
>>
>> This uses slurm (or whatever job manager) to allocate resources for 12
>> tasks and then spawns within that allocation. Maybe that's 'good
>> enough' -- spawning within the assigned allocation? Likely this
>> requires minimal modification of the current code.
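>>
>> If the current code sets up its cluster with snow, something along
>> these lines might work in the same allocation (an untested sketch;
>> it assumes snow's "MPI" cluster type, which spawns its workers via
>> Rmpi, and the file name is just an example):
>>
>>      ## salloc -n 12 orterun -n 1 R -f snow-spawn.R
>>      library(snow)
>>      cl <- makeCluster(4, type = "MPI")      # spawns 4 Rmpi workers
>>      res <- clusterApply(cl, 1:8, function(i)
>>          c(i = i, host = Sys.info()[["nodename"]]))
>>      print(do.call(rbind, res))
>>      stopCluster(cl)
>>      Rmpi::mpi.quit()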
>>
>> More extensive is to revise the manager/worker-style code to 
>> something more like single instruction, multiple data
>>
>>
>> simd.R
>> ======
>> ## salloc -n 4 orterun R --slave -f simd.R
>> sink("/dev/null") # don't capture output -- more care needed here
>> library(Rmpi)
>>
>> TAGS = list(FROM_WORKER=1L)
>> .comm = 0L
>>
>> ## shared `work', here just determine rank and host
>> work = c(rank=mpi.comm.rank(.comm),
>>          host=system("hostname", intern=TRUE))
>>
>> if (mpi.comm.rank(.comm) == 0) {
>>       ## manager
>>       mpi.barrier(.comm)
>>       nWorkers = mpi.comm.size(.comm)
>>       res = vector("list", nWorkers)  # pre-allocate the result list
>>       for (i in seq_len(nWorkers - 1L)) {
>>           res[[i]] <- mpi.recv.Robj(mpi.any.source(), TAGS$FROM_WORKER,
>>                                     comm=.comm)
>>       }
>>       res[[nWorkers]] = work
>>       sink() # restore normal output
>>       print(do.call(rbind, res))
>> } else {
>>       ## worker
>>       mpi.barrier(.comm)
>>       mpi.send.Robj(work, 0L, TAGS$FROM_WORKER, comm=.comm)
>> }
>> mpi.quit()
>>
>> but this likely requires some serious code revision; if going this 
>> route then
>> http://r-pbd.org/ might be helpful (and from a similar HPC environment).
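>>
>> For reference, a rough pbdMPI version of simd.R might look something
>> like this (untested; function names from memory, so check the pbdMPI
>> documentation):
>>
>>      ## mpirun -np 4 Rscript simd-pbd.R
>>      library(pbdMPI)
>>      init()
>>      work <- c(rank = comm.rank(), host = Sys.info()[["nodename"]])
>>      res <- gather(work)     # rank 0 gets a list of every rank's 'work'
>>      if (comm.rank() == 0)
>>          print(do.call(rbind, res))
>>      finalize()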
>>
>> It's always worth asking whether the code is written to be efficient 
>> in R -- a
>> typical 'mistake' is to write R-level explicit 'for' loops that
>> "copy-and-append" results, along the lines of
>>
>>      len <- 100000
>>      result <- NULL
>>      for (i in seq_len(len))
>>          ## some complicated calculation, then...
>>          result <- c(result, sqrt(i))
>>
>> whereas it's much better to "pre-allocate and fill"
>>
>>       result <- numeric(len)
>>       for (i in seq_len(len))
>>           result[[i]] <- sqrt(i)
>>
>> or
>>
>>       lapply(seq_len(len), sqrt)
>>
>> and very much better still to 'vectorize'
>>
>>       result <- sqrt(seq_len(len))
>>
>> (timings for me are about 1 minute for "copy-and-append", 0.2 s for
>> "pre-allocate and fill", and 0.002 s for "vectorize").
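>>
>> A quick way to repeat that comparison yourself (timings will of
>> course vary by machine):
>>
>>      len <- 100000
>>      system.time({                    # copy-and-append
>>          result <- NULL
>>          for (i in seq_len(len)) result <- c(result, sqrt(i))
>>      })
>>      system.time({                    # pre-allocate and fill
>>          result <- numeric(len)
>>          for (i in seq_len(len)) result[[i]] <- sqrt(i)
>>      })
>>      system.time(result <- sqrt(seq_len(len)))   # vectorize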
>>
>> Pushing back on the guy providing the code (grep for "for" loops, and 
>> look for
>> that copy-and-append pattern) might save you from having to use parallel
>> evaluation at all.
>>
>> Martin
>>
>>>
>>> Thanks,
>>> Jim
>>>
>>
>>
>
>


