[R] R and Openmpi
Marce
marcemb at gmail.com
Mon Jun 2 12:29:06 CEST 2008
2008/5/31 Dirk Eddelbuettel <edd at debian.org>:
>
> Paul,
>
> On 30 May 2008 at 15:47, Paul Hewson wrote:
> | Hello,
> |
> | We have R working with Rmpi/openmpi, but I'm a little worried. Specifically, (a) the -np flag doesn't seem to override the hostfile (it works fine with a Fortran hello world) and (b) I appear to have twice as many processes running as I think I should.
> |
> | Rmpi version 0.5.5
> | Openmpi version 1.1
>
> That's old. Open MPI 1.2.* fixed and changed a lot of things. I am happy with
> 1.2.6, the default on Debian.
>
> | Viglen HPC with (effectively) 9 blades and 8 nodes on each blade.
> | myhosts file contains details of the 9 blades, but specifies that there are 4 slots on each blade (to make sure I leave room for other users).
> |
> | When running mpirun -bynode -np 2 -hostfile myhosts R --slave --vanilla task_pull.R
> |
> | 1. I get as many R slaves as there are slots defined in my myhosts file (there are 36 slots defined, and I get 36 slaves, regardless of the setting of -np); the master goes on the first machine in the myhosts file.
> | 2. The .Rout file confirms that I have 1 comm with 1 master and 36 slaves
> | 3. When I top each blade it indicates that there are in fact 8 processes running on each blade and
> | 4. When I pstree each blade it indicates that there are two orted processes, each with 4 subprocesses.
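
The counts reported above can be checked with a little arithmetic. What follows is only a minimal sketch: the real myhosts file was not posted, so the blade names are invented, and only the "slots=4" layout is taken from the description. It sums the slots the hostfile allows and compares that with the 8 processes per blade seen in top.

```shell
#!/bin/sh
# Hypothetical hostfile: nine blades (names invented), 4 slots each,
# matching the layout described in the message.
hostfile=$(mktemp)
for i in 1 2 3 4 5 6 7 8 9; do
    echo "blade$i slots=4"
done > "$hostfile"

# Sum the slots= values: the number of processes the hostfile allows.
expected=$(awk '{ for (i = 2; i <= NF; i++) if ($i ~ /^slots=/) { split($i, kv, "="); total += kv[2] } } END { print total + 0 }' "$hostfile")

# Reported observation: top shows 8 processes on each of the 9 blades.
observed=$((9 * 8))

echo "slots in hostfile:  $expected"
echo "processes observed: $observed"
rm -f "$hostfile"
```

With these assumed values the hostfile allows 36 processes while 72 are observed, which is the factor of two described in the message.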
>
> You never showed us task_pull.R ... And as I readily acknowledge that this
> can be tricky, why don't you experiment with a simpler setting? Consider this
> token littler [1] invocation (or use Rscript if you prefer / have only that):
>
> edd@ron:~> r -e'library(Rmpi); cat("Hello rank", mpi.comm.rank(0), "size", mpi.comm.size(0), "on", mpi.get.processor.name(), "\n")'
> Hello rank 0 size 1 on ron
> edd@ron:~>
>
> So without an outer mpirun (or orterun as the Open MPI group now calls it) we
> get one instance. Makes sense.
>
> Now with two hosts defined on the fly, and two instances each:
>
> edd@ron:~> orterun -n 4 -H ron,joe r -e'library(Rmpi); cat("Hello rank", mpi.comm.rank(0), "size", mpi.comm.size(0), "on", mpi.get.processor.name(), "\n")'
> Hello rank 0 size 4 on ron
> Hello rank 2 size 4 on ron
> Hello rank 3 size 4 on joe
> Hello rank 1 size 4 on joe
> edd@ron:~>
>
> Adding '-bynode' and using '-np 4' instead of '-n 4' does not change anything.
>
> | From the point of view of getting a job done this ***seems*** OK (it's running very quickly), but it doesn't seem quite right, given I'm sharing the machine with other users and so on. Is there something I've missed in the usage of mpirun with R/Rmpi?
>
> I cannot quite determine from what you said here what your objective is.
> What exactly are you trying to do that you are not getting done? Using fewer
> instances? Maybe that is in fact an Open MPI 1.2.* versus 1.1.* issue.
>
> One thing to note is that if you wrap all this in the excellent snow package
> by Tierney et al, then Open MPI's '-n' can always be one, as you determine from
> _within_ how many nodes you want:
>
> edd@ron:~> orterun -bynode -np 1 -H ron,joe r -e'library(snow); cl <- makeCluster(4, "MPI"); res <- clusterCall(cl, function() Sys.info()["nodename"]); print(do.call(rbind, res))'
> Loading required package: utils
> Loading required package: Rmpi
>         4 slaves are spawned successfully. 0 failed.
>      nodename
> [1,] "joe"
> [2,] "ron"
> [3,] "joe"
> [4,] "ron"
> edd@ron:~>
>
> Note the outer '-n 1' and the inner makeCluster(4, "MPI") to give you 4
> slaves. If you use a larger '-n $N' you will get $N instances each starting
> as many nodes as makeCluster asks for.
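
The relationship between the outer '-n' and the inner makeCluster() count can be written out as arithmetic. This is only a sketch of the rule as stated above; the function name total_workers is made up for illustration and is not part of Open MPI, Rmpi or snow.

```shell
#!/bin/sh
# total_workers OUTER_N INNER_WORKERS
# OUTER_N is the value given to orterun's -n; INNER_WORKERS is the count
# passed to makeCluster(). Each of the OUTER_N instances spawns its own
# INNER_WORKERS workers, so the totals multiply.
total_workers() {
    echo $(( $1 * $2 ))
}

echo "-n 1, makeCluster(4): $(total_workers 1 4) workers"
echo "-n 2, makeCluster(4): $(total_workers 2 4) workers"
```

So under this rule an outer '-n 2' with makeCluster(4) would give 8 workers in total, on top of the two R instances that spawned them.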
>
> Hope this helps, Dirk
>
> [1] Littler can be had via Debian / Ubuntu or from
> http://dirk.eddelbuettel.com/code/littler.html
>
> --
> Three out of two people have difficulties with fractions.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Hi Dirk, I'm now using R and Open MPI on a cluster. Could you point me to
some pages with information about this? I'm interested in the
installation; all the pages I've seen use LAM.
So far I've just installed R and Rmpi, but I have some problems when I
spawn nodes in the same VLAN.
Thanks for everything.