[Rd] Rmpi_0.5-4 and OpenMPI questions

Luke Tierney luke at stat.uiowa.edu
Thu Oct 4 18:02:03 CEST 2007


On Thu, 4 Oct 2007, Hao Yu wrote:

> Hi Dirk,
>
> Thanks for pointing out the additional flags needed to compile Rmpi
> correctly. Those flags can be added in configure.ac once the openmpi dir
> is detected. BTW, the -DMPI2 flag was missing in your Rmpi build because
> the detection of openmpi was not working. It should be
> ####
>        if test -d  ${MPI_ROOT}/lib/openmpi; then
>                echo "Found openmpi dir in ${MPI_ROOT}/lib"
>                MPI_DEPS="-DMPI2"
>        fi
> ####
>
> I tried to run Rmpi under snow and got the same error message. But after
> checking makeMPIcluster, I found that n=3 was the wrong argument. When
> makeMPIcluster finds that count is missing,
> count=mpi.comm.size(0)-1 is used. If you start R alone, this returns
> count=0 since there is only one member (the master). I do not know why snow
> did not use count=mpi.universe.size()-1 to find the total number of nodes
> available.

The bit of code you are looking at, for handling calls with no count
argument, is for the case where workers have been started by mpirun
with the RMPISNOW script rather than by spawning.  Using
mpi.universe.size() to guess a reasonable default for the spawning
case might be useful -- I will look into that.
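
To make the two defaults concrete, here is a rough sketch (illustration
only, not the actual snow code) of what the current behavior amounts to,
what a universe-size-based default would look like, and the explicit-count
call that works:

     library(Rmpi)
     library(snow)

     ## current default when 'count' is missing: size of COMM_WORLD minus
     ## the master -- 0 if R was started on its own, hence the
     ## "no nodes available" error seen below
     mpi.comm.size(0) - 1

     ## a possible default for the spawning case: slots known to mpirun,
     ## minus the master
     mpi.universe.size() - 1

     ## explicit form that works (note the argument is 'count', not 'n')
     cl <- makeMPIcluster(count = 3)
     parApply(cl, matrix(1:6, 2), 1, sum)
     stopCluster(cl)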

I have OpenMPI installed on Fedora 7 x86_64.  Rmpi 0.5-4 configure
fails for me -- it does not find mpi.h.  I can get Rmpi to build if I
manually set these in Makevars:

     PKG_CFLAGS   = $(ARCHCFLAGS) -I/usr/include/openmpi
     PKG_LIBS     = -L/usr/lib64/openmpi -L/lib -lmpi -lpthread -fPIC $(ARCHLIB)

When I try to use R -> library(Rmpi) -> library(snow) -> cl <- makeMPIcluster(2)
or mpirun -np 3 R -> library(Rmpi) -> ... I get

     Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
     Failing at addr:0x1d26ab7
     [0] func:/usr/lib64/openmpi/libopal.so.0 [0x2aaaafee3263]
     [1] func:/lib64/libc.so.6 [0x367f030630]
     [2] func:/usr/lib64/openmpi/libmpi.so.0(ompi_fortran_string_f2c+0x8c) [0x2aaaaf813bcc]
     [3] func:/usr/lib64/openmpi/libmpi.so.0(mpi_comm_spawn_f+0x75) [0x2aaaaf816405]
     ...
     *** End of error message ***
     Segmentation fault
     luke at nokomis ~%
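
Spelled out as an R session, the sequence that triggers this is simply the
following; judging from the trace above, the crash happens inside the MPI
spawn call:

     library(Rmpi)
     library(snow)
     cl <- makeMPIcluster(2)   # workers are spawned over MPI; the segfault occurs here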

But mpirun -np 3 RMPISNOW does seem to work, more or less.  A modified
version of RMPISNOW, hopefully attached, does a better job of getting
sensible arguments to the workers and master, but the master R still
thinks it is non-interactive.  I have not figured out a work-around
for that yet -- suggestions welcome.
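
For reference, once the workers are started this way the master session can
be driven in the usual snow fashion -- roughly like this, assuming the
profile script has set things up so that getMPIcluster() can return the
running cluster:

     ## in the master R session started by 'mpirun -np 3 RMPISNOW'
     library(snow)
     cl <- getMPIcluster()
     clusterCall(cl, function() Sys.info()[["nodename"]])
     stopCluster(cl)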

Best,

luke

> Anyway, after using
> cl=makeMPIcluster(count=3),
> I was able to run the parApply function.
>
> I tried
> R -> library(Rmpi) -> library(snow) -> c1=makeMPIcluster(3)
>
> Also
> mpirun -host hostfile -np 1 R --no-save
> library(Rmpi) -> library(snow) -> c1=makeMPIcluster(3)
>
> Hao
>
> PS: hostfile contains the info for all nodes, so in R mpi.universe.size()
> returns the right number and spawning will reach the remote nodes.
>
> Rmpi under Debian 3.1 and openmpi 1.2.4 seems OK. I did find some missing
> libs under Debian 4.0.
>
>
> Dirk Eddelbuettel wrote:
>>
>> Many thanks to Dr Yu for updating Rmpi for R 2.6.0, and for starting to
>> make
>> the changes to support Open MPI.
>>
>> I have just built the updated Debian package of Rmpi (i.e. r-cran-rmpi)
>> under R 2.6.0, but I cannot convince myself yet whether it works or not.
>> Simple tests work.  E.g. on my Debian testing box, with Rmpi installed
>> directly using Open MPI 1.2.3-2 (from Debian) and using 'r' from littler:
>>
>> edd at ron:~> orterun -np 3 r -e 'library(Rmpi); print(mpi.comm.rank(0))'
>> [1] 0
>> [1] 1
>> [1] 2
>> edd at ron:~>
>>
>> but I basically cannot get anything more complicated to work yet.  R /
>> Rmpi just seem to hang; in particular, snow and getMPIcluster() just sit
>> there:
>>
>>> cl <- makeSOCKcluster(c("localhost", "localhost"))
>>> stopCluster(cl)
>>> library(Rmpi)
>>> cl <- makeMPIcluster(n=3)
>> Error in makeMPIcluster(n = 3) : no nodes available.
>>>
>>
>> I may be overlooking something simple here; in particular, the launching of
>> apps appears to be different for Open MPI than it was with LAM/MPI (or
>> maybe I am just confused because I am also looking at LLNL's slurm for use
>> with Open MPI?)
>>
>> Has anybody gotten Open MPI and Rmpi to work on simple demos?  Similarly,
>> is anybody using snow with Rmpi and Open MPI yet?
>>
>> Also, the Open MPI FAQ is pretty clear on their preference for using mpicc
>> for compiling/linking to keep control of the compiler and linker options
>> and switches.  Note that e.g. on my Debian system
>>
>> edd at ron:~> mpicc --showme:link
>> -pthread -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl
>> -lutil -lm -ldl
>>
>> whereas Rmpi built with just the default from R CMD:
>>
>> gcc-4.2 -std=gnu99 -shared  -o Rmpi.so RegQuery.o Rmpi.o conversion.o
>> internal.o -L/usr/lib -lmpi -lpthread -fPIC   -L/usr/lib/R/lib -lR
>>
>> Don't we need libopen-rte and libopen-pal as the MPI FAQ suggests?
>>
>> Many thanks, Dirk
>>
>> --
>> Three out of two people have difficulties with fractions.
>>
>
>
>

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
-------------- next part --------------
#! /bin/sh

# if defined, prepend R_SNOW_LIB to $R_LIBS
if test ! -z "${R_SNOW_LIB}" ; then
    R_LIBS=${R_SNOW_LIB}:${R_LIBS}; export R_LIBS
fi

# find the library containing the snow package; should eventually use Rscript
snowdir=`echo 'invisible(cat(tryCatch(dirname(.find.package("snow")), error = function(e) ""),"\n",sep=""))' | R --slave`

# for now this hijacks the R_PROFILE mechanism to start up the R
# sessions and load snow and Rmpi into them
R_PROFILE=${snowdir}/snow/RMPISNOWprofile; export R_PROFILE

if test ! -z "${LAMRANK}" ; then
    # use the LAMRANK environment variable set by LAM-MPI's mpirun to
    # run R with appropriate arguments for master and workers.
    if test "${LAMRANK}" == "0" ; then
	exec R $*
    else
	exec R --slave > /dev/null 2>&1
    fi
elif test ! -z "${OMPI_MCA_ns_nds_vpid}" ; then
    # Similar approach for OpenMPI using the OMPI_MCA_ns_nds_vpid
    # variable.  Don't know if it might be better to use
    # OMPI_MCA_ns_nds_vpid_start instead.  The master R process thinks
    # it is non-interactive so for now --no-save or something like
    # that is needed.
    if test "${OMPI_MCA_ns_nds_vpid}" == "0" ; then
	exec R --no-save $*
    else
	exec R --slave > /dev/null 2>&1
    fi
else 
    # The fallback is to use the same arguments on master and workers,
    # with --no-save for cases where workers don't have a terminal.
    # This means that things like CMD batch won't work. It seems to be
    # important NOT to use exec here, at least when this code runs under LAM.
    R --no-save $*
fi

