[R-sig-hpc] Rmpi: Connection to lifeline [[...,0],0] lost

Sklyar, Oleg (London) osklyar at maninvestments.com
Thu Feb 12 11:17:52 CET 2009


After moving from LAM to Open MPI, we started getting the following error
message from Rmpi:

[hpc01:06981] [[11499,1],0] routed:binomial: Connection to lifeline [[11499,0],0] lost
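
For anyone reading the message, the bracketed names can be split into
fields. This is only a minimal sketch, assuming Open MPI's usual
[[job family,local job id],vpid] naming, in which local job id 0 normally
denotes the runtime daemons (mpirun/orted) rather than an application
process. The helper name parse_lifeline is made up for illustration and is
not part of Open MPI or Rmpi; it expects the whole message on one line.

```shell
#!/bin/sh
# parse_lifeline: hypothetical helper, not shipped with Open MPI or Rmpi.
# Reads log lines on stdin and prints the fields of every
# "Connection to lifeline ... lost" message; other lines are ignored.
# Assumption: process names look like [[job family,local job id],vpid].
parse_lifeline() {
  sed -n -E 's/^\[([^:]+):([0-9]+)\] \[\[([0-9]+),([0-9]+)\],([0-9]+)\] routed:([a-z_]+): Connection to lifeline \[\[([0-9]+),([0-9]+)\],([0-9]+)\] lost$/host=\1 pid=\2 reporter_job=\3.\4 reporter_vpid=\5 routed=\6 lifeline_job=\7.\8 lifeline_vpid=\9/p'
}

# Example: feed the message from this post through the helper.
printf '%s\n' '[hpc01:06981] [[11499,1],0] routed:binomial: Connection to lifeline [[11499,0],0] lost' | parse_lifeline
```

Read this way, the reporting process is job 1, vpid 0 of job family 11499
(PID 06981 on hpc01), and the lifeline it lost is job 0, vpid 0 of the same
family, which under the assumption above would be the runtime daemon rather
than one of the spawned workers.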

The main script is a long-running process (about 3 hours) that periodically
spawns short jobs (about 2 minutes each) onto the nodes. Any ideas what
could be causing this?

Thanks.

Oleg

--------------------------------------------------

*** R 2.9.0 (svn -r 47821) [/share/research/R-devel/20090203/lib64/R] ***
> library(Rmpi)
> sessionInfo()
R version 2.9.0 Under development (unstable) (2009-02-02 r47821) 
x86_64-unknown-linux-gnu 

locale:
C

attached base packages:
[1] stats     graphics  utils     datasets  grDevices methods   base


other attached packages:
[1] Rmpi_0.5-7

--------------------------------------------------

[hpc01] ~
* ompi_info
                 Package: Open MPI osklyar at hpc01 Distribution
                Open MPI: 1.3
   Open MPI SVN revision: r20295
   Open MPI release date: Jan 19, 2009
                Open RTE: 1.3
   Open RTE SVN revision: r20295
   Open RTE release date: Jan 19, 2009
                    OPAL: 1.3
       OPAL SVN revision: r20295
       OPAL release date: Jan 19, 2009
            Ident string: 1.3
                  Prefix: /share/research/opt/openmpi
 Configured architecture: x86_64-unknown-linux-gnu
          Configure host: hpc01
           Configured by: osklyar
           Configured on: Tue Feb  3 15:04:03 GMT 2009
          Configure host: hpc01
                Built by: osklyar
                Built on: Tue Feb  3 15:19:46 GMT 2009
              Built host: hpc01
              C bindings: yes
            C++ bindings: yes
      Fortran77 bindings: yes (all)
      Fortran90 bindings: yes
 Fortran90 bindings size: small
              C compiler: gcc
     C compiler absolute: /share/research/opt/bin/gcc
            C++ compiler: g++
   C++ compiler absolute: /share/research/opt/bin/g++
      Fortran77 compiler: gfortran
  Fortran77 compiler abs: /share/research/opt/bin/gfortran
      Fortran90 compiler: gfortran
  Fortran90 compiler abs: /share/research/opt/bin/gfortran
             C profiling: yes
           C++ profiling: yes
     Fortran77 profiling: yes
     Fortran90 profiling: yes
          C++ exceptions: no
          Thread support: posix (mpi: no, progress: no)
           Sparse Groups: no
  Internal debug support: no
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
         libltdl support: yes
   Heterogeneous support: no
 mpirun default --prefix: no
         MPI I/O support: yes
       MPI_WTIME support: gettimeofday
Symbol visibility support: yes
   FT Checkpoint support: no  (checkpoint thread: no)
           MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.3)
              MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.3)
           MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.3)
               MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.3)
               MCA carto: file (MCA v2.0, API v2.0, Component v1.3)
           MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.3)
               MCA timer: linux (MCA v2.0, API v2.0, Component v1.3)
         MCA installdirs: env (MCA v2.0, API v2.0, Component v1.3)
         MCA installdirs: config (MCA v2.0, API v2.0, Component v1.3)
                 MCA dpm: orte (MCA v2.0, API v2.0, Component v1.3)
              MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.3)
           MCA allocator: basic (MCA v2.0, API v2.0, Component v1.3)
           MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.3)
                MCA coll: basic (MCA v2.0, API v2.0, Component v1.3)
                MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.3)
                MCA coll: inter (MCA v2.0, API v2.0, Component v1.3)
                MCA coll: self (MCA v2.0, API v2.0, Component v1.3)
                MCA coll: sm (MCA v2.0, API v2.0, Component v1.3)
                MCA coll: tuned (MCA v2.0, API v2.0, Component v1.3)
                  MCA io: romio (MCA v2.0, API v2.0, Component v1.3)
               MCA mpool: fake (MCA v2.0, API v2.0, Component v1.3)
               MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.3)
               MCA mpool: sm (MCA v2.0, API v2.0, Component v1.3)
                 MCA pml: cm (MCA v2.0, API v2.0, Component v1.3)
                 MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.3)
                 MCA pml: v (MCA v2.0, API v2.0, Component v1.3)
                 MCA bml: r2 (MCA v2.0, API v2.0, Component v1.3)
              MCA rcache: vma (MCA v2.0, API v2.0, Component v1.3)
                 MCA btl: self (MCA v2.0, API v2.0, Component v1.3)
                 MCA btl: sm (MCA v2.0, API v2.0, Component v1.3)
                 MCA btl: tcp (MCA v2.0, API v2.0, Component v1.3)
                MCA topo: unity (MCA v2.0, API v2.0, Component v1.3)
                 MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.3)
                 MCA osc: rdma (MCA v2.0, API v2.0, Component v1.3)
                 MCA iof: hnp (MCA v2.0, API v2.0, Component v1.3)
                 MCA iof: orted (MCA v2.0, API v2.0, Component v1.3)
                 MCA iof: tool (MCA v2.0, API v2.0, Component v1.3)
                 MCA oob: tcp (MCA v2.0, API v2.0, Component v1.3)
                MCA odls: default (MCA v2.0, API v2.0, Component v1.3)
                 MCA ras: slurm (MCA v2.0, API v2.0, Component v1.3)
               MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.3)
               MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.3)
               MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.3)
                 MCA rml: oob (MCA v2.0, API v2.0, Component v1.3)
              MCA routed: binomial (MCA v2.0, API v2.0, Component v1.3)
              MCA routed: direct (MCA v2.0, API v2.0, Component v1.3)
              MCA routed: linear (MCA v2.0, API v2.0, Component v1.3)
                 MCA plm: rsh (MCA v2.0, API v2.0, Component v1.3)
                 MCA plm: slurm (MCA v2.0, API v2.0, Component v1.3)
               MCA filem: rsh (MCA v2.0, API v2.0, Component v1.3)
              MCA errmgr: default (MCA v2.0, API v2.0, Component v1.3)
                 MCA ess: env (MCA v2.0, API v2.0, Component v1.3)
                 MCA ess: hnp (MCA v2.0, API v2.0, Component v1.3)
                 MCA ess: singleton (MCA v2.0, API v2.0, Component v1.3)
                 MCA ess: slurm (MCA v2.0, API v2.0, Component v1.3)
                 MCA ess: tool (MCA v2.0, API v2.0, Component v1.3)
             MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.3)
             MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.3)

Dr Oleg Sklyar
Research Technologist
AHL / Man Investments Ltd
+44 (0)20 7144 3107
osklyar at maninvestments.com
