[R-sig-hpc] Rmpi and cpu usage on slaves

Dirk Eddelbuettel edd at debian.org
Wed Apr 22 13:11:11 CEST 2009


On 21 April 2009 at 16:40, Sean Davis wrote:
| I am running SGE 6.2, Open MPI 1.3.1, and Rmpi 0.5-7 on openSUSE Linux.  I can
| start up an arbitrarily sized cluster using SGE, see the appropriate
| universe size using mpi.universe.size(), and start a cluster using
| mpi.spawn.Rslaves().  However, it appears that all the slaves then run at
| 100% CPU on all nodes.  Even using Rmpi under Open MPI with a simple hostfile
| produces the same result.  Any suggestions for figuring out what the slaves
| are doing?

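There is a known issue with Open MPI and blocking calls which you may be
hitting here: a process waiting in a blocking call polls continuously instead
of sleeping, so it shows up at 100% CPU even when it has nothing to do.
Upstream Open MPI considers this a feature. But as it has come up a few times
on their mailing list as well, I believe the last word was that it will go
away in a future release.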
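
One possible way to sidestep the blocking issue, where the waiting happens in
code you control, is to poll with a non-blocking probe and sleep between
polls. The following is only a minimal sketch under that assumption, not
something tested on your setup: wait_politely(), its arguments, and the fake
probe are made-up names for illustration, and the commented-out lines assume
Rmpi's mpi.iprobe(), which reports whether a message is pending without
blocking.

```r
# Poll a non-blocking probe and sleep between polls, so the process stays
# mostly idle instead of spinning inside a blocking MPI call.
# `probe` is any zero-argument function returning TRUE once work is ready
# (hypothetical helper, not part of Rmpi).
wait_politely <- function(probe, interval = 0.1, timeout = Inf) {
  start <- Sys.time()
  repeat {
    if (isTRUE(probe())) return(TRUE)
    elapsed <- as.numeric(difftime(Sys.time(), start, units = "secs"))
    if (elapsed >= timeout) return(FALSE)   # give up after `timeout` seconds
    Sys.sleep(interval)                     # yield the CPU between polls
  }
}

# With Rmpi loaded, the probe could wrap mpi.iprobe(), for example:
#   wait_politely(function() mpi.iprobe(mpi.any.source(), mpi.any.tag(), comm = 1))
#   msg <- mpi.recv.Robj(mpi.any.source(), mpi.any.tag(), comm = 1)

# Stand-alone demonstration with a fake probe that succeeds on its third call.
calls <- 0
fake_probe <- function() {
  calls <<- calls + 1
  calls >= 3
}
ready <- wait_politely(fake_probe, interval = 0.01)
```

The sleep interval trades latency for idle CPU: a longer interval keeps the
waiting process quieter but delays its reaction to an incoming message.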

Hth, Dirk

| Thanks,
| Sean
| 
| 
| > library(Rmpi)
| > mpi.universe.size()
| [1] 24
| > mpi.spawn.Rslaves()
|         24 slaves are spawned successfully. 0 failed.
| master  (rank 0 , comm 1) of size 25 is running on: Mahfouz
| slave1  (rank 1 , comm 1) of size 25 is running on: Mahfouz
| slave2  (rank 2 , comm 1) of size 25 is running on: Mahfouz
| slave3  (rank 3 , comm 1) of size 25 is running on: Mahfouz
| slave4  (rank 4 , comm 1) of size 25 is running on: Mahfouz
| slave5  (rank 5 , comm 1) of size 25 is running on: Mahfouz
| slave6  (rank 6 , comm 1) of size 25 is running on: Mahfouz
| slave7  (rank 7 , comm 1) of size 25 is running on: Mahfouz
| slave8  (rank 8 , comm 1) of size 25 is running on: Grass
| slave9  (rank 9 , comm 1) of size 25 is running on: Grass
| slave10 (rank 10, comm 1) of size 25 is running on: Grass
| slave11 (rank 11, comm 1) of size 25 is running on: Grass
| slave12 (rank 12, comm 1) of size 25 is running on: Grass
| slave13 (rank 13, comm 1) of size 25 is running on: Grass
| slave14 (rank 14, comm 1) of size 25 is running on: Grass
| slave15 (rank 15, comm 1) of size 25 is running on: Grass
| slave16 (rank 16, comm 1) of size 25 is running on: shakespeare
| slave17 (rank 17, comm 1) of size 25 is running on: shakespeare
| slave18 (rank 18, comm 1) of size 25 is running on: shakespeare
| slave19 (rank 19, comm 1) of size 25 is running on: shakespeare
| slave20 (rank 20, comm 1) of size 25 is running on: shakespeare
| slave21 (rank 21, comm 1) of size 25 is running on: shakespeare
| slave22 (rank 22, comm 1) of size 25 is running on: shakespeare
| slave23 (rank 23, comm 1) of size 25 is running on: shakespeare
| slave24 (rank 24, comm 1) of size 25 is running on: Mahfouz
| > mpi.close.Rslaves()
| [1] 1
| 
| > sessionInfo()    # on the master
| R version 2.9.0 Under development (unstable) (2009-02-21 r47969)
| x86_64-unknown-linux-gnu
| 
| locale:
| LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
| 
| attached base packages:
| [1] stats     graphics  grDevices utils     datasets  methods   base
| 
| other attached packages:
| [1] Rmpi_0.5-7
| 
| _______________________________________________
| R-sig-hpc mailing list
| R-sig-hpc at r-project.org
| https://stat.ethz.ch/mailman/listinfo/r-sig-hpc

-- 
Three out of two people have difficulties with fractions.


