[R-sig-hpc] Rmpi and cpu usage on slaves
Dirk Eddelbuettel
edd at debian.org
Wed Apr 22 13:11:11 CEST 2009
On 21 April 2009 at 16:40, Sean Davis wrote:
| I am running sge6.2, openmpi 1.3.1, and Rmpi 0.5.7 on openSUSE linux. I can
| start up an arbitrarily-sized cluster using sge, see the appropriate
| universe.size using Rmpi, and start a cluster using mpi.spawn.Rslaves().
| However, it appears that all the slaves then run at 100% cpu on all nodes.
| Even using Rmpi under openmpi with a simple hostfile produces the same
| result. Any suggestions to figure out what is going on on the slaves?
There is a known issue with Open MPI and blocking calls which you may be
hitting here: a process waiting in a blocking call polls for messages instead
of sleeping, so idle slaves show up at 100% cpu. Upstream Open MPI considers
this a feature. But as it has come up a few times on their mailing list as
well, I believe the last word was that it will go away in a future release.
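Until then, one workaround that is sometimes suggested (an assumption here, not something confirmed in this thread) is to avoid sitting in a blocking call: check for a message with a non-blocking probe and sleep between checks. The sketch below shows only the waiting pattern in plain R; `wait_with_sleep` and `make_fake_probe` are hypothetical names for illustration, and the probe is a stand-in so that the code runs without MPI.

```r
# Hypothetical helper, not part of Rmpi: wait until probe() returns TRUE,
# sleeping between checks so the waiting process does not spin on the cpu.
wait_with_sleep <- function(probe, interval = 0.1, max_tries = Inf) {
  tries <- 0
  while (tries < max_tries) {
    if (isTRUE(probe())) return(TRUE)   # a message is waiting
    tries <- tries + 1
    Sys.sleep(interval)                 # give the cpu back while idle
  }
  FALSE                                 # gave up after max_tries checks
}

# Stand-in probe for demonstration only: reports a "message" on the
# ready_after-th check.
make_fake_probe <- function(ready_after) {
  n <- 0
  function() {
    n <<- n + 1
    n >= ready_after
  }
}

got_message <- wait_with_sleep(make_fake_probe(3), interval = 0.01)
timed_out   <- wait_with_sleep(function() FALSE, interval = 0.01, max_tries = 5)
```

On a slave, the probe could be something like `function() mpi.iprobe(mpi.any.source(), mpi.any.tag())`, followed by `mpi.recv.Robj()` once it returns TRUE. The trade-off is latency: a slave may take up to `interval` seconds to notice new work.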
Hth, Dirk
| Thanks,
| Sean
|
|
| > library(Rmpi)
| > mpi.universe.size()
| [1] 24
| > mpi.spawn.Rslaves()
| 24 slaves are spawned successfully. 0 failed.
| master (rank 0 , comm 1) of size 25 is running on: Mahfouz
| slave1 (rank 1 , comm 1) of size 25 is running on: Mahfouz
| slave2 (rank 2 , comm 1) of size 25 is running on: Mahfouz
| slave3 (rank 3 , comm 1) of size 25 is running on: Mahfouz
| slave4 (rank 4 , comm 1) of size 25 is running on: Mahfouz
| slave5 (rank 5 , comm 1) of size 25 is running on: Mahfouz
| slave6 (rank 6 , comm 1) of size 25 is running on: Mahfouz
| slave7 (rank 7 , comm 1) of size 25 is running on: Mahfouz
| slave8 (rank 8 , comm 1) of size 25 is running on: Grass
| slave9 (rank 9 , comm 1) of size 25 is running on: Grass
| slave10 (rank 10, comm 1) of size 25 is running on: Grass
| slave11 (rank 11, comm 1) of size 25 is running on: Grass
| slave12 (rank 12, comm 1) of size 25 is running on: Grass
| slave13 (rank 13, comm 1) of size 25 is running on: Grass
| slave14 (rank 14, comm 1) of size 25 is running on: Grass
| slave15 (rank 15, comm 1) of size 25 is running on: Grass
| slave16 (rank 16, comm 1) of size 25 is running on: shakespeare
| slave17 (rank 17, comm 1) of size 25 is running on: shakespeare
| slave18 (rank 18, comm 1) of size 25 is running on: shakespeare
| slave19 (rank 19, comm 1) of size 25 is running on: shakespeare
| slave20 (rank 20, comm 1) of size 25 is running on: shakespeare
| slave21 (rank 21, comm 1) of size 25 is running on: shakespeare
| slave22 (rank 22, comm 1) of size 25 is running on: shakespeare
| slave23 (rank 23, comm 1) of size 25 is running on: shakespeare
| slave24 (rank 24, comm 1) of size 25 is running on: Mahfouz
| > mpi.close.Rslaves()
| [1] 1
|
| > sessionInfo() # on the master
| R version 2.9.0 Under development (unstable) (2009-02-21 r47969)
| x86_64-unknown-linux-gnu
|
| locale:
| LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
|
| attached base packages:
| [1] stats graphics grDevices utils datasets methods base
|
| other attached packages:
| [1] Rmpi_0.5-7
--
Three out of two people have difficulties with fractions.