[R-sig-hpc] Rmpi working with OpenMPI and PBSPro but snow fails

luke at stat.uiowa.edu luke at stat.uiowa.edu
Wed Mar 4 15:52:46 CET 2009


mpiexec -n 3 RMPISNOW -f snowtest_solo.r

works for me with OpenMPI (openmpi-1.2.4-2.fc9.x86_64) and current snow.
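
For comparison with the job script quoted below, which calls mpiexec
without -n, here is a rough sketch of passing the process count
explicitly. It assumes $PBS_NODEFILE holds one line per MPI process,
which should be checked on the PBSPro installation in question.
count_procs is a made-up helper name, and the sketch is not a claim
that a missing -n is the cause of the hang.

```shell
#!/bin/sh
# Count MPI processes from a node file (assumed: one line per process).
count_procs() {
    # grep -c . counts non-empty lines, so trailing blank lines are ignored
    grep -c . "$1"
}

# Outside a PBS job PBS_NODEFILE is unset; skip the launch rather than fail.
if [ -n "${PBS_NODEFILE-}" ] && [ -f "$PBS_NODEFILE" ]; then
    NPROCS=$(count_procs "$PBS_NODEFILE")
    # RMPISNOW uses one process as the master and the rest as workers
    echo "launching $NPROCS processes: 1 master, $((NPROCS - 1)) workers"
    mpiexec -n "$NPROCS" RMPISNOW -f snowtest_solo.r
fi
```

With select=1:ncpus=4:mpiprocs=4 this would be expected to start one
master and three workers, if the node file assumption holds.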

The RMPISNOW script does try to identify the master in order to adjust
the arguments, but that shouldn't cause confusion about who is the
master -- that is based on the rank.  It may be that the profile file
setting you mentioned is getting in the way, as RMPISNOW uses the
R_PROFILE environment variable to get the top-level code into the
processes.
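
One way to see whether a profile setting could be interfering is to
print what R would pick up at startup from inside the job. The sketch
below is only a diagnostic, under the assumption that the relevant
settings are the standard R startup ones (R_PROFILE, R_PROFILE_USER,
and a .Rprofile in the working or home directory). report_r_profiles
is a made-up helper name.

```shell
#!/bin/sh
# Report the profile settings R would see at startup in this environment.
report_r_profiles() {
    # ${VAR-unset} expands to "unset" only when VAR is not set at all
    echo "R_PROFILE=${R_PROFILE-unset}"
    echo "R_PROFILE_USER=${R_PROFILE_USER-unset}"
    # Default user profile locations R consults when R_PROFILE_USER is unset
    for f in ./.Rprofile "$HOME/.Rprofile"; do
        if [ -f "$f" ]; then
            echo "found $f"
        fi
    done
}

report_r_profiles
```

Running this in the PBS job script just before the mpiexec line shows
what the job environment hands to RMPISNOW. If a user profile turns up,
moving it aside for one test run is a cheap way to rule it out.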

luke




On Wed, 4 Mar 2009, Huw Lynes wrote:

> On Wed, 2009-03-04 at 07:49 -0600, luke at stat.uiowa.edu wrote:
>> On Wed, 4 Mar 2009, Huw Lynes wrote:
>>
>>>
>
> Hi Luke,
>
> Thanks for the quick response.
>
>>> Moving onto snow in the same environment trying to setup by using
>>> getMPICluster() returns an error in checkCluster() saying that there is
>>> something wrong with the cluster.
>>
>> I don't know what "Moving to snow" means exactly as you don't give
>> details of how you are starting things up, so I have to guess.  If you
>> are using mpiexec then you need to run snow via the RMPISNOW shell
>> script, which for NPROCS sets up a master and a cluster with NPROCS -
>> 1 workers, and then use
>>
>> cl <- makeCluster()
>>
>> to access the already running cluster.
>>
>
> If I take the following trivial R script:
>
> ------------------------------------------------------------------------
>
> library(Rmpi)
> library(snow)
>
> cl <- makeCluster()
> clusterCall(cl, function() Sys.info()[c("nodename","machine")])
> stopCluster(cl)
> ------------------------------------------------------------------------
>
> and run it as
> ------------------------------------------------------------------------
> #!/bin/bash
> #PBS -q SMP_queue
> #PBS -l select=1:ncpus=4:mpiprocs=4
> #PBS -l place=scatter:excl
>
>
> module load apps/R
> module load libs/R-mpi
>
> cd $PBS_O_WORKDIR
> cat $PBS_NODEFILE
>
> mpiexec RMPISNOW -f snowtest_solo.r
> -----------------------------------------------------------------------
>
> all the R processes just sit there spinning rather than doing anything
> useful and I have to kill the job.
>
> The suggestion in this mail:
> https://stat.ethz.ch/pipermail/r-sig-hpc/2009-January/000069.html
>
> results in the same problem of R spinning. I suspect that there is
> something different about my OpenMPI setup that means snow is failing to
> set up a master process. So you end up with all four processes as slaves
> spinning on a network poll.
>
>

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


