[R-sig-hpc] difficulty spawning Rslaves

Allan Strand stranda at cofc.edu
Tue Dec 29 15:37:44 CET 2009


Thanks Dirk and Ramon.

I tried LAM 7.1.2 and am still seeing the same type of behavior.  I am still
searching for a solution and will report back.

cheers,
a.

On 12/28/2009 12:28 PM, Ramon Diaz-Uriarte wrote:
> More along Dirk's comments: we currently have two clusters using LAM,
> both Debian systems, one using release 7.1.2 of LAM and the other
> 7.1.1. On a current Ubuntu-based laptop, things are working with
> release 7.1.2.
>
> Best,
>
> R.
>
> On Mon, Dec 28, 2009 at 5:14 PM, Dirk Eddelbuettel<edd at debian.org>  wrote:
>    
>> Allan,
>>
>> On 23 December 2009 at 16:05, Allan Strand wrote:
>> | My setup is on a cluster running 64-bit FC.  I have recently broken my
>> | install of Rmpi (and hence snow) by upgrading some very old versions of R,
>> | lam/mpi, Rmpi, and snow (currently installed versions listed at the
>> | bottom of this email).  No doubt this is a problem with my Rmpi install,
>> | but I'm having trouble seeing it.
>> |
>> | I cannot seem to spawn more than a single slave (which is spawned on the
>> | master node)
>> | e.g.:
>> |
>> |>  mpi.spawn.Rslaves(comm=1,nslaves=1)
>> |      1 slaves are spawned successfully. 0 failed.
>> | master (rank 0, comm 1) of size 2 is running on: node0
>> | slave1 (rank 1, comm 1) of size 2 is running on: node0
>> |
>> |>  mpi.comm.free(comm=1)
>> | [1] 1
>> |
>> |>  mpi.spawn.Rslaves(comm=1,nslaves=2)
>> |      2 slaves are spawned successfully. 0 failed.
>> | Error in mpi.intercomm.merge(intercomm, 0, comm) :
>> |    MPI_Error_string: process in local group is dead
>> |
>> | No doubt the answer is contained in the MPI_Error string, but I'm not
>> | sure how to interpret it.
>> |
>> | Thanks,
>> | Allan
>> | ===================================
>> | Versions (all installed locally in my account with directory appropriate
>> | ./configure settings)
>> |
>> | R 2.10.1
>> | LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University
>>   ^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>> For what it is worth, a looong time ago (two years? longer?) when I was
>> helping Manuel to get the Debian OpenMPI packages in and when I was
>> transitioning off LAM, I had concluded that the very latest 7.1.X releases of
>> LAM were broken for me.  The system was a then-current Ubuntu system with the
>> LAM and OpenMPI packages compiled from Debian sources.  Provided I 'froze'
>> LAM at 7.1.2 things would work; the newer ones would not.
>>
>> So I'd recommend either downgrading to the last LAM that worked for you or,
>> rather, taking the plunge and jumping to Open MPI. The 1.3.* series is pretty
>> good already, and 1.4.0 is just around the corner.
>>
>> Just my $0.02. The problem may of course be entirely different.
>>
>> Dirk
>>
>> --
>> Three out of two people have difficulties with fractions.

-- 
Allan Strand,   Biology    http://linum.cofc.edu
College of Charleston      Ph. (843) 953-9189
Charleston, SC 29424       Fax (843) 953-9199


