[R-sig-hpc] difficulty spawning Rslaves

Ramon Diaz-Uriarte rdiaz02 at gmail.com
Mon Dec 28 18:28:56 CET 2009


Following up on Dirk's comments: we currently have two clusters using LAM,
both Debian systems, one running LAM 7.1.2 and the other 7.1.1. On a current
Ubuntu-based laptop, things also work with release 7.1.2.
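
In case it helps with narrowing things down, here is a minimal sketch of a
check one might run after changing the LAM version. It assumes Rmpi loads
cleanly and that LAM was booted on more than one node before starting R; the
slave count of 2 is only an illustration, taken from the failing call in
Allan's report.

```r
# Sketch only: assumes Rmpi is installed and LAM has already been booted
# on more than one node before R was started.
library(Rmpi)

# How many CPUs the MPI universe reports to R.
mpi.universe.size()

# Spawn two slaves (the case that fails in Allan's report).
mpi.spawn.Rslaves(nslaves = 2)

# Ask each slave which host it is running on.
mpi.remote.exec(mpi.get.processor.name())

# Shut the slaves down cleanly before leaving R.
mpi.close.Rslaves()
```

If the spawn works, the hostnames returned by the slaves should show whether
they landed on nodes other than the master.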

Best,

R.

On Mon, Dec 28, 2009 at 5:14 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
>
> Allan,
>
> On 23 December 2009 at 16:05, Allan Strand wrote:
> | My setup is on a cluster running 64bit FC.  I have recently broken my
> | install Rmpi (and hence snow) by upgrading some very old versions of R,
> | lam/mpi, Rmpi, and snow (currently installed versions listed at the
> | bottom of this email).  No doubt this is a problem with my Rmpi install,
> | but I'm having trouble seeing it.
> |
> | I cannot seem to spawn more than a single slave (which is spawned on the
> | master node)
> | e.g.:
> |
> |  > mpi.spawn.Rslaves(comm=1,nslaves=1)
> |      1 slaves are spawned successfully. 0 failed.
> | master (rank 0, comm 1) of size 2 is running on: node0
> | slave1 (rank 1, comm 1) of size 2 is running on: node0
> |
> |  > mpi.comm.free(comm=1)
> | [1] 1
> |
> |  > mpi.spawn.Rslaves(comm=1,nslaves=2)
> |      2 slaves are spawned successfully. 0 failed.
> | Error in mpi.intercomm.merge(intercomm, 0, comm) :
> |    MPI_Error_string: process in local group is dead
> |
> | No doubt the answer is contained in the MPI_Error string, but I'm not
> | sure how to interpret it.
> |
> | Thanks,
> | Allan
> | ===================================
> | Versions (all installed locally in my account with directory appropriate
> | ./configure settings)
> |
> | R 2.10.1
> | LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University
>  ^^^^^^^^^^^^^^^^^^^^^^^^^
>
> For what it is worth, a long time ago (two years? longer?), when I was
> helping Manuel with the Debian Open MPI packages and transitioning off
> LAM, I concluded that the very latest 7.1.x releases of LAM were broken
> for me.  The system was a then-current Ubuntu system with the LAM and
> Open MPI packages compiled from Debian sources.  Provided I froze LAM at
> 7.1.2, things would work; the newer releases would not.
>
> So I'd recommend either downgrading to the last LAM that worked for you or,
> better, taking the plunge and jumping to Open MPI. The 1.3.* series is
> pretty good already, and 1.4.0 is just around the corner.
>
> Just my $0.02. The problem may of course be entirely different.
>
> Dirk
>
> --
> Three out of two people have difficulties with fractions.
>
> _______________________________________________
> R-sig-hpc mailing list
> R-sig-hpc at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>



-- 
Ramon Diaz-Uriarte
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz
Phone: +34-91-732-8000 ext. 3019


