[R-sig-hpc] Error message from MPI

Paul Johnson pauljohn32 at gmail.com
Fri Aug 12 17:51:10 CEST 2011

Hello, Jim

I have seen similar messages, but the jobs still run.  On my system it
happens because our cluster has a mixture of fabrics: some nodes have
plain Ethernet, others have InfiniBand.  (I suspect yours is similar,
given the openib message.)  When I submit jobs on the cluster and they
land on Ethernet-connected nodes, the InfiniBand (openib) transport
tries to initialize, can't hook up, and Open MPI falls back to another
transport.
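If the warning is just noise on Ethernet-only nodes, one common
workaround is to tell Open MPI to skip the openib BTL entirely, so it
goes straight to TCP instead of probing InfiniBand and falling back.  A
sketch; the exact mpirun invocation and the script name (myscript.R)
are placeholders you would adapt to your own job file:

```shell
# Exclude the InfiniBand (openib) BTL; Open MPI will use TCP directly
# on Ethernet-only nodes instead of probing openib and warning.
mpirun --mca btl ^openib -np 128 R --no-save -f myscript.R

# Equivalently, set the MCA parameter in the environment before
# submitting, so every mpirun in the job picks it up:
export OMPI_MCA_btl=^openib
```

Note that this disables InfiniBand for the whole job, so you would only
want it on jobs you know will run over Ethernet.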

Would you mind posting your submission script and the R code? Or post a link?

My collection of R-MPI examples is growing; you can browse it here:
http://web.ku.edu/~quant/cgi-bin/mw1/index.php?title=Cluster:Main
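For reference, a minimal doMPI/foreach job looks roughly like this.  A
sketch only: the loop body and iteration count are placeholders, and it
has to be launched under mpirun on a cluster with Rmpi installed, so it
won't run on a plain workstation:

```r
library(doMPI)

# Start one worker per MPI slot granted by the scheduler (e.g. LSF),
# reserving one process for the master.
cl <- startMPIcluster()
registerDoMPI(cl)

# A trivial parallel loop; replace the body with the real computation.
results <- foreach(i = 1:1000, .combine = c) %dopar% {
  sqrt(i)
}

# Shut the workers down cleanly before exiting.
closeCluster(cl)
mpi.quit()
```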

On Thu, Aug 11, 2011 at 6:40 AM, Jim Maas <j.maas at uea.ac.uk> wrote:
> Hi All,
> I'm relatively new to this, but it has worked well for me previously
> and now I'm getting errors.
> I'm attempting to run an R job on a cluster managed by the LSF batch
> system, using the packages doMPI and foreach.  In the job file I've
> requested 128 slots, and MPI gets that number quite successfully, but
> it is giving this error message
> =====================
> Loading required package: Rmpi
> Loading required package: Rmpi
> --------------------------------------------------------------------------
> [[51610,1],5]: A high-performance Open MPI point-to-point messaging
> module
> was unable to find any relevant network interfaces:
> Module: OpenFabrics (openib)
>  Host: cn004.private.dns.zone
> Another transport will be used instead, although this may result in
> lower performance.
> =====================
> As a test I've run the same job with a smaller number of slots, and it
> runs fine on 32 or 64 slots, but when I increase to 128 slots I get
> this error.
> I guess I'm asking if the R packages doMPI and foreach are scalable, and
> to what level?


Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
