[R-sig-hpc] mpi_comm_spawn error with Rmpi and snow on SGI Altix
Gad Abraham
gabraham at csse.unimelb.edu.au
Fri Feb 6 02:20:49 CET 2009
Stefan Theussl wrote:
> Hi Gad,
>
> Presumably you have installed an old implementation of MPI.
> mpi.comm.spawn has been supported since the MPI 1.2 standard as far as I
> remember. Can you pls tell us which implementation/version of MPI you
> use? (in case of LAM you may send us the output of 'laminfo')
Hi Stefan & Martin,
I'm merging my offline conversation with Martin into this thread.
Details of the system:
SGI Altix 3700Bx2, SUSE Linux Enterprise Server 10 SP1.
/usr/lib/libmpi.so comes from the SGI package sgi-mpt-1.21-sgi601r1,
which according to its release notes supports some MPI-2 features such as
MPI_Comm_spawn.
Martin suggested checking whether the symbols are in the library, and
thought that perhaps Rmpi isn't configuring itself correctly with -DMPI2:
> nm /usr/lib/libmpi.so | grep comm_spawn
0000000000099180 W mpi_comm_spawn_
0000000000099180 W mpi_comm_spawn__
000000000009b390 W mpi_comm_spawn_multiple_
000000000009b390 W mpi_comm_spawn_multiple__
0000000000164dc0 T MPI_SGI_comm_spawn_request
0000000000099180 T pmpi_comm_spawn_
0000000000099180 W pmpi_comm_spawn__
000000000009b390 T pmpi_comm_spawn_multiple_
000000000009b390 W pmpi_comm_spawn_multiple__
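One caveat on the check above: grep is case-sensitive by default, so the pattern comm_spawn cannot match the C binding MPI_Comm_spawn, and the lowercase names with trailing underscores look like Fortran-style bindings. Whether this libmpi.so exports the C symbol is therefore not settled by the listing. A minimal sketch of an exact-name check follows; the helper name has_symbol is hypothetical, and it assumes nm output in the usual "address type name" layout:

```shell
# has_symbol (hypothetical helper): reads nm output on stdin and succeeds
# only if the exact symbol name $1 is listed as defined (type T or W).
has_symbol() {
    awk -v sym="$1" '
        $NF == sym && $(NF-1) ~ /^[TW]$/ { found = 1 }
        END { exit !found }
    '
}

# Possible usage (assumes GNU binutils nm; -D lists dynamic symbols):
#   nm -D /usr/lib/libmpi.so | has_symbol MPI_Comm_spawn && echo "C binding present"
```

A quicker but looser alternative is simply adding -i to the original grep.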
Here's sample output from R CMD INSTALL Rmpi; -DMPI2 is indeed not set:
gcc -std=gnu99 -I/home/gabraham/dmf/Software/R-2.8.1/include
-DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\"
-DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -I/usr/include -DUNKNOWN
-fPIC -I/usr/local/include -fpic -g -O2 -c conversion.c -o conversion.o
If I recompile Rmpi, either with MPI_DEPS="-DMPI2" R CMD INSTALL Rmpi or,
as Martin suggested, by setting MPI_DEPS="-DMPI2" in Rmpi/configure.ac and
then running autoconf followed by R CMD INSTALL, then -DMPI2 is set:
gcc -std=gnu99 -I/home/gabraham/dmf/Software/R-2.8.1/include
-DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\"
-DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -I/usr/include -DMPI2
-DUNKNOWN -fPIC -I/usr/local/include -fpic -g -O2 -c conversion.c -o
conversion.o
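Spotting a single flag in long compiler lines by eye is error-prone, so here is a minimal sketch of an automated check. The helper name has_define and the log file name install.log are hypothetical; the sketch assumes the flag appears as a separate whitespace-delimited token of the form -DNAME (it will not match -DNAME=value):

```shell
# has_define (hypothetical helper): reads build output on stdin and succeeds
# if -D<name> occurs as a whole whitespace-delimited token.
has_define() {
    # Put each token on its own line, then look for an exact line match.
    tr -s ' \t' '\n\n' | grep -x -- "-D$1" >/dev/null
}

# Possible usage (install.log is a hypothetical file name):
#   MPI_DEPS="-DMPI2" R CMD INSTALL Rmpi 2>&1 | tee install.log
#   has_define MPI2 < install.log && echo "-DMPI2 was passed to the compiler"
```

Because the input is split on whitespace, compiler commands that wrap across lines, as in the output above, are handled the same way as single-line commands.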
However, I now get a different error when I run the example code:
Error calling job_getjid(): No such file or directory
Error in mpi.comm.spawn(slave = mpitask, slavearg = args, nslaves = count, :
Error during spawn request
Calls: makeCluster ... switch -> makeMPIcluster -> mpi.comm.spawn -> .Call
Execution halted
MPI: MPI_COMM_WORLD rank 0 has terminated without calling MPI_Finalize()
MPI: aborting job
Thanks,
Gad
--
Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabraham at csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham