[R-sig-hpc] mpi_comm_spawn error with Rmpi and snow on SGI Altix

Gad Abraham gabraham at csse.unimelb.edu.au
Fri Feb 6 02:20:49 CET 2009


Stefan Theussl wrote:
> Hi Gad,
> 
> Presumably you have installed an old implementation of MPI. 
> mpi.comm.spawn has been supported since the MPI 1.2 standard as far as I 
> remember. Can you pls tell us which implementation/version of MPI you 
> use? (in case of LAM you may send us the output of 'laminfo')

Hi Stefan & Martin,

I'm merging my offline conversation with Martin.

Details of the system:
SGI Altix 3700Bx2, SUSE enterprise server 10 SP1
/usr/lib/libmpi.so comes from the SGI package sgi-mpt-1.21-sgi601r1, 
which, according to its release notes, supports some MPI-2 features such 
as MPI_Comm_spawn.

Martin suggested checking whether the symbols are in the library, and 
thought that perhaps Rmpi isn't configuring itself with -DMPI2:

 > nm /usr/lib/libmpi.so | grep comm_spawn
0000000000099180 W mpi_comm_spawn_
0000000000099180 W mpi_comm_spawn__
000000000009b390 W mpi_comm_spawn_multiple_
000000000009b390 W mpi_comm_spawn_multiple__
0000000000164dc0 T MPI_SGI_comm_spawn_request
0000000000099180 T pmpi_comm_spawn_
0000000000099180 W pmpi_comm_spawn__
000000000009b390 T pmpi_comm_spawn_multiple_
000000000009b390 W pmpi_comm_spawn_multiple__
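
A side note on the listing above: the names with trailing underscores 
look like the Fortran bindings, and since grep is case-sensitive by 
default, the pattern comm_spawn cannot match the C entry point 
MPI_Comm_spawn whether or not it is present. A minimal sketch of a 
stricter check follows. The helper name has_c_comm_spawn is 
hypothetical, and the sketch assumes the usual three-column nm output 
(address, type, name).

```shell
# Hypothetical helper: reads `nm` output on stdin and succeeds only if the
# C binding MPI_Comm_spawn is defined (symbol type T or W).
has_c_comm_spawn() {
    awk '$2 ~ /^[TW]$/ && $3 == "MPI_Comm_spawn" { found = 1 }
         END { exit !found }'
}

# Demo on two lines copied from the listing above: only Fortran-style
# names are present, so the check fails and the message is printed.
printf '%s\n' \
    '0000000000099180 W mpi_comm_spawn_' \
    '0000000000099180 T pmpi_comm_spawn_' |
    has_c_comm_spawn || echo "no C binding in this listing"

# On the real system (library path taken from this post):
#   nm /usr/lib/libmpi.so | has_c_comm_spawn && echo "C binding present"
```

A quicker alternative is the same pipeline with grep -i comm_spawn, 
which matches both the C and the Fortran spellings.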


Here's sample output from R CMD INSTALL Rmpi; -DMPI2 is indeed not set:

gcc -std=gnu99 -I/home/gabraham/dmf/Software/R-2.8.1/include 
-DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" 
-DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -I/usr/include  -DUNKNOWN 
-fPIC -I/usr/local/include    -fpic  -g -O2 -c conversion.c -o conversion.o


If I recompile Rmpi, either with MPI_DEPS="-DMPI2" R CMD INSTALL Rmpi or 
by setting MPI_DEPS="-DMPI2" in Rmpi/configure.ac and then running 
autoconf and R CMD INSTALL, as suggested by Martin, then -DMPI2 is set:

gcc -std=gnu99 -I/home/gabraham/dmf/Software/R-2.8.1/include 
-DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" 
-DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\"  -I/usr/include -DMPI2 
-DUNKNOWN -fPIC -I/usr/local/include    -fpic  -g -O2 -c conversion.c -o 
conversion.o
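
Rather than reading the compile lines by eye, the build log can be 
checked mechanically. This is only a sketch: the helper name 
all_compiles_have_mpi2 and the file name install.log are hypothetical, 
and it assumes each gcc invocation sits on a single line in the real 
log (the lines above are wrapped by the mail client).

```shell
# Hypothetical helper: reads a build log on stdin and succeeds only if at
# least one gcc compile line was seen and every such line carries -DMPI2.
all_compiles_have_mpi2() {
    awk '/^gcc / && /-c / {
             n++
             if ($0 !~ /(^| )-DMPI2( |$)/) missing++
         }
         END { exit (n == 0 || missing > 0) }'
}

# Possible usage, with the install command taken from this post:
#   MPI_DEPS="-DMPI2" R CMD INSTALL Rmpi 2>&1 | tee install.log
#   all_compiles_have_mpi2 < install.log && echo "-DMPI2 set everywhere"
```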


However, I then get a different error when I run the example code:

Error calling job_getjid(): No such file or directory
Error in mpi.comm.spawn(slave = mpitask, slavearg = args, nslaves = 
count,  :
   Error during spawn request
Calls: makeCluster ... switch -> makeMPIcluster -> mpi.comm.spawn -> .Call
Execution halted
MPI: MPI_COMM_WORLD rank 0 has terminated without calling MPI_Finalize()
MPI: aborting job


Thanks,
Gad

-- 
Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabraham at csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham


