[R-sig-hpc] what causes this fork warning from openmpi
Paul Johnson
pauljohn32 at gmail.com
Tue Sep 28 00:43:08 CEST 2010
Hi, I wonder if you have seen this message from openmpi-1.4.1 and R-2.11.1.
I have built a small sample program that triggers this error/warning
every time. In fact, I still get it if I do nothing more than
ask Rmpi to spawn the slaves.
I see the following in the standard-error ("e") file that PBS creates
automatically. This used to be intermittent, and I thought it was caused
by trying to use multicore, but now I see it every time I run the
example.
--------------------------------------------------------------------------
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process. Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption. The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.
The process that invoked fork was:
Local host: compute-2-19.local (PID 10000)
MPI_COMM_WORLD rank: 0
If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------
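For what it's worth, if the fork turns out to be harmless (the slave spawning does seem to work, see below), the warning can be silenced exactly as the message suggests. A sketch, assuming Open MPI's usual convention of reading any MCA parameter from an OMPI_MCA_<name> environment variable:

```shell
# Open MPI picks up MCA parameters from OMPI_MCA_<name> environment
# variables, so exporting this before orterun suppresses the fork warning.
export OMPI_MCA_mpi_warn_on_fork=0

# The same parameter can be passed on the orterun command line instead
# (not executed here):
#   orterun --mca mpi_warn_on_fork 0 --hostfile $PBS_NODEFILE -n 1 \
#     R --no-save --vanilla -f mi-test.R

echo "OMPI_MCA_mpi_warn_on_fork=$OMPI_MCA_mpi_warn_on_fork"
```

Note this only hides the warning; it does not make fork() any safer under Open MPI.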
I have CentOS Linux ("Rocks" cluster) with openmpi-1.4.1.
> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-redhat-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
Small working example that causes this problem:
Submission script
======== sub-test.sh=============================
$ cat sub-test.sh
#!/bin/sh
#
#This is a submission script to batch out the full sim
#
#These commands set up the Grid Environment for your job:
#PBS -N MpiParallel
#PBS -l nodes=12:ppn=4
#PBS -l walltime=480:00:00
#PBS -M pauljohnku.edu
#PBS -m bea
cd $PBS_O_WORKDIR
orterun --hostfile $PBS_NODEFILE -n 1 R --no-save --vanilla -f mi-test.R
====================================================
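For completeness: the script above gets submitted with qsub, and because of the "#PBS -N MpiParallel" line, PBS names the output files after the job, so the fork warning lands in the job's standard-error file:

```shell
# Not run here: submit the script to the PBS/Torque queue with
#   qsub sub-test.sh
# PBS writes stdout to MpiParallel.o<jobid> and stderr -- where the
# fork warning appears -- to MpiParallel.e<jobid>:
errfile="MpiParallel.e<jobid>"
echo "$errfile"
```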
=========mi-test.R==========================
if (!is.loaded("mpi_initialize")) {
    library(Rmpi)
}
## see http://math.acadiau.ca/ACMMaC/Rmpi/sample.html
### Try worker processes;
mpi.spawn.Rslaves(nslaves=8)
# In case R exits unexpectedly, have it automatically clean up
# resources taken up by Rmpi (slaves, memory, etc...)
.Last <- function(){
    if (is.loaded("mpi_initialize")){
        if (mpi.comm.size(1) > 0){
            print("Please use mpi.close.Rslaves() to close slaves.")
            mpi.close.Rslaves()
        }
        print("Please use mpi.quit() to quit R")
        .Call("mpi_finalize")
    }
}
### here's where I used to have mpi commands :)
mpi.close.Rslaves()
mpi.quit()
====================================================
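Since the warning appears even with nothing but the spawn call, one way to pin down which call actually forks (a sketch, assuming strace is installed on the compute nodes; fork-trace.log is a name I made up) is to trace fork-family syscalls in the master R process:

```shell
# Hypothetical diagnostic: wrap the R master in strace and log the
# clone/fork/vfork syscalls to see what creates the child process.
# Not executed here; this is the command line you would substitute
# into sub-test.sh (single quotes keep $PBS_NODEFILE literal):
cmd='orterun --hostfile $PBS_NODEFILE -n 1 strace -f -e trace=clone,fork,vfork -o fork-trace.log R --no-save --vanilla -f mi-test.R'
echo "$cmd"
# After the run, fork-trace.log records each clone/fork with its caller PID.
```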
See? I've cut out everything except the spawn command.
In the output file, that part shows no trouble at all:
> mpi.spawn.Rslaves(nslaves=8)
8 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 9 is running on: compute-2-19
slave1 (rank 1, comm 1) of size 9 is running on: compute-2-19
slave2 (rank 2, comm 1) of size 9 is running on: compute-2-19
slave3 (rank 3, comm 1) of size 9 is running on: compute-2-19
slave4 (rank 4, comm 1) of size 9 is running on: compute-2-18
slave5 (rank 5, comm 1) of size 9 is running on: compute-2-18
slave6 (rank 6, comm 1) of size 9 is running on: compute-2-18
slave7 (rank 7, comm 1) of size 9 is running on: compute-2-18
slave8 (rank 8, comm 1) of size 9 is running on: compute-2-17
I promise, I've run Rmpi lots of times, and I've never had such a simple bit
of code cause the fork error/warning.
pj
--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas