[R] snow/Rmpi without MPI.spawn?
luke-tierney at uiowa.edu
Thu Sep 4 18:37:47 CEST 2014
You could look into the RMPISNOW shell script that is included in snow
for use with mpirun, e.g.

    mpirun -np 3 RMPISNOW

The script might need adjusting for your setting.
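
A minimal sketch of what the R side might look like, assuming RMPISNOW
has already set up the cluster so that snow's getMPIcluster() returns it
on the master. The names task and run_tasks are made up for illustration,
and the serial fallback is only there so the script also runs outside
mpirun:

```r
## Hypothetical per-task computation; stands in for the real work.
task <- function(i) sqrt(i)

## Hypothetical helper: use the cluster set up by RMPISNOW if there is one,
## otherwise fall back to a plain serial loop.
run_tasks <- function(n) {
    cl <- if (requireNamespace("snow", quietly = TRUE)) {
        snow::getMPIcluster()   # assumed to be NULL when not under RMPISNOW
    } else {
        NULL
    }
    if (is.null(cl)) {
        ## No MPI cluster available: evaluate the tasks serially.
        return(vapply(seq_len(n), task, numeric(1)))
    }
    ## Shut the workers down when done.
    on.exit(snow::stopCluster(cl), add = TRUE)
    ## Distribute the tasks over the processes that mpirun already started.
    snow::parSapply(cl, seq_len(n), task)
}

res <- run_tasks(6)
print(res)
```

Under mpirun the master would hand the calls to the workers that were
launched up front, so nothing should need to be spawned.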
Best,
luke
On Thu, 4 Sep 2014, Jim Leek wrote:
> Ah, now it's working. Thanks. Now I just need to figure out how to get snow
> doing this...
>
> Jim
>
> On 09/04/2014 05:03 AM, Martin Morgan wrote:
>> On 09/03/2014 10:24 PM, Leek, Jim wrote:
>>> Thanks for the tips. I'll take a look around for 'for' loops in the
>>> morning.
>>>
>>> I think the example you provided worked for OpenMPI. (The default on our
>>> machine is MPICH2, but it gave the same error about calling spawn.)
>>> Anyway, with OpenMPI I got this:
>>>
>>>> # salloc -n 12 orterun -n 1 R -f spawn.R
>>>> library(Rmpi)
>>>> ## Recent Rmpi bug -- should be mpi.universe.size()
>>>> nWorkers <- mpi.universe.size()
>>
>> (the '## Recent Rmpi bug' comment should have been removed, it's a holdover
>> from when the script was written several years ago)
>>
>>>> nslaves = 4
>>>> mpi.spawn.Rslaves(nslaves)
>>
>> The argument needs to be named
>>
>> mpi.spawn.Rslaves(nslaves=4)
>>
>> otherwise R matches unnamed arguments by position, and '4' is associated
>> with the 'Rscript' argument.
>>
>> Martin
>>
>>> Reported: 2 (out of 2) daemons - 4 (out of 4) procs
>>>
>>> Then it hung there. So things spawned anyway, which is progress. I'm
>>> just not sure whether that is expected behavior for parSapply or not.
>>>
>>> Jim
>>>
>>> -----Original Message-----
>>> From: Martin Morgan [mailto:mtmorgan at fhcrc.org]
>>> Sent: Wednesday, September 03, 2014 5:08 PM
>>> To: Leek, Jim; r-help at r-project.org
>>> Subject: Re: [R] snow/Rmpi without MPI.spawn?
>>>
>>> On 09/03/2014 03:25 PM, Jim Leek wrote:
>>>> I'm a programmer at a high-performance computing center. I'm not very
>>>> familiar with R, but I have used MPI from C, C++, and Python. I have
>>>> to run an R code provided by a guy who knows R, but not MPI. So, this
>>>> fellow used the R snow library to parallelize his R code
>>>> (theoretically, I'm not actually sure what he did.) I need to get
>>>> this code running on our machines.
>>>>
>>>> However, Rmpi and snow seem to require mpi spawn, which our computing
>>>> center doesn't support. I even tried building Rmpi with MPICH1
>>>> instead of 2, because Rmpi has that option, but it still tries to use
>>>> spawn.
>>>>
>>>> I can launch plenty of processes, but I have to launch them all at
>>>> once at the beginning. Is there any way to convince Rmpi to just use
>>>> those processes rather than trying to spawn its own? I haven't found
>>>> any documentation on this issue, although I would've thought it would be
>>>> quite common.
>>>
>>> This script
>>>
>>> spawn.R
>>> =======
>>> # salloc -n 12 orterun -n 1 R -f spawn.R
>>> library(Rmpi)
>>> ## Recent Rmpi bug -- should be mpi.universe.size()
>>> nWorkers <- mpi.universe.size()
>>> mpi.spawn.Rslaves(nslaves=nWorkers)
>>> mpiRank <- function(i)
>>>     c(i=i, rank=mpi.comm.rank())
>>> mpi.parSapply(seq_len(2*nWorkers), mpiRank)
>>> mpi.close.Rslaves()
>>> mpi.quit()
>>>
>>> can be run like the comment suggests
>>>
>>> salloc -n 12 orterun -n 1 R -f spawn.R
>>>
>>> This uses slurm (or whatever job manager) to allocate resources for 12 tasks
>>> and spawn within that allocation. Maybe that's 'good enough' -- spawning
>>> within the assigned allocation? Likely this requires minimal modification
>>> of the current code.
>>>
>>> More extensive is to revise the manager/worker-style code to something
>>> more like single instruction, multiple data
>>>
>>>
>>> simd.R
>>> ======
>>> ## salloc -n 4 orterun R --slave -f simd.R
>>> sink("/dev/null") # don't capture output -- more care needed here
>>> library(Rmpi)
>>>
>>> TAGS = list(FROM_WORKER=1L)
>>> .comm = 0L
>>>
>>> ## shared `work', here just determine rank and host
>>> work = c(rank=mpi.comm.rank(.comm),
>>>          host=system("hostname", intern=TRUE))
>>>
>>> if (mpi.comm.rank(.comm) == 0) {
>>>     ## manager
>>>     mpi.barrier(.comm)
>>>     nWorkers = mpi.comm.size(.comm)
>>>     res = vector("list", nWorkers)  # pre-allocate one slot per rank
>>>     for (i in seq_len(nWorkers - 1L)) {
>>>         res[[i]] <- mpi.recv.Robj(mpi.any.source(), TAGS$FROM_WORKER,
>>>                                   comm=.comm)
>>>     }
>>>     res[[nWorkers]] = work
>>>     sink() # start capturing output
>>>     print(do.call(rbind, res))
>>> } else {
>>>     ## worker
>>>     mpi.barrier(.comm)
>>>     mpi.send.Robj(work, 0L, TAGS$FROM_WORKER, comm=.comm)
>>> }
>>> mpi.quit()
>>>
>>> but this likely requires some serious code revision; if going this route
>>> then http://r-pbd.org/ might be helpful (and from a similar HPC environment).
>>>
>>> It's always worth asking whether the code is written to be efficient in R
>>> -- a typical 'mistake' is to write R-level explicit 'for' loops that
>>> "copy-and-append" results, along the lines of
>>>
>>> len <- 100000
>>> result <- NULL
>>> for (i in seq_len(len))
>>>     ## some complicated calculation, then...
>>>     result <- c(result, sqrt(i))
>>>
>>> whereas it's much better to "pre-allocate and fill"
>>>
>>> result <- numeric(len)
>>> for (i in seq_len(len))
>>>     result[[i]] = sqrt(i)
>>>
>>> or
>>>
>>> lapply(seq_len(len), sqrt)
>>>
>>> and very much better still to 'vectorize'
>>>
>>> result <- sqrt(seq_len(len))
>>>
>>> (timings for me are about 1 minute for "copy-and-append", 0.2 s for
>>> "pre-allocate and fill", and 0.002 s for "vectorize").
>>>
>>> Pushing back on the guy providing the code (grep for "for" loops, and look
>>> for that copy-and-append pattern) might save you from having to use parallel
>>> evaluation at all.
>>>
>>> Martin
>>>
>>>>
>>>> Thanks,
>>>> Jim
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>>
>
--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu