[R-sig-hpc] 'Rmpi' issue

Paul Johnson pauljohn32 at gmail.com
Mon Apr 30 08:24:21 CEST 2012


Greetings,

You'll have to spend more time studying the basics of creating a
cluster inside R and talking to its nodes.


Your R program is not complete enough to create a cluster and talk to
it. For working code, try the examples I have here:

http://web.ku.edu/~quant/cgi-bin/mw1/index.php?title=Cluster:Main#R_Packages_and_Parallel_Computing

Go to the hpcexample archive, something like:

http://winstat.quant.ku.edu/svn/hpcexample/trunk/Ex53-HelloWorldRmpi

or
http://winstat.quant.ku.edu/svn/hpcexample/trunk/Ex60-HelloWorldSnow
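
For orientation, here is a minimal sketch of what a complete Rmpi
program has to do: load the package, start the workers, talk to them,
and shut them down. It is not copied from the examples above. It
assumes that Rmpi is installed and that the MPI runtime lets a single
master R process spawn workers (rather than starting every rank
through an Rmpi .Rprofile). run_hello and its nslaves default are
names made up for illustration.

```r
# Minimal Rmpi "hello world" sketch. Assumes a working Rmpi install and an
# MPI runtime that allows a master R process to spawn workers.
# run_hello() is a hypothetical wrapper, not part of Rmpi.
run_hello <- function(nslaves = 4) {
  library(Rmpi)                          # load the MPI bindings in the master
  mpi.spawn.Rslaves(nslaves = nslaves)   # start the worker R processes
  on.exit(mpi.close.Rslaves())           # always shut the workers down
  # Each worker evaluates the expression and returns its own answer.
  mpi.remote.exec(paste("I am", mpi.comm.rank(), "of", mpi.comm.size()))
}

# Example use in a batch script (commented out so that sourcing this file
# does not require MPI):
# print(run_hello(4))
# mpi.quit()
```

Because the workers are spawned from inside R, a script like this would
typically be started as a single master process, not as one R process
per rank.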


You appear to have a very old R, if 2.11 is what I see there. I'd
not bother with such an old setup. Do you even know for sure that Rmpi
is working there? Can't your system admin give you a working example?
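
One way to check, sketched under assumptions: try loading the package
in a separate R process, so that a crash during loading cannot take
down your own session. loads_in_child is a hypothetical helper name.
Whether Rmpi can be loaded outside of mpirun at all depends on the MPI
implementation, so a FALSE here is a hint, not proof.

```r
# Sketch: try to load a package in a separate R process, so that a crash
# while loading (such as a segfault in dyn.load) cannot take down the
# current session. loads_in_child() is a hypothetical helper.
loads_in_child <- function(pkg = "Rmpi") {
  rscript <- file.path(R.home("bin"), "Rscript")
  expr <- sprintf("library(%s)", pkg)
  # With stdout/stderr discarded, system2() returns the child's exit status.
  status <- system2(rscript, c("-e", shQuote(expr)),
                    stdout = FALSE, stderr = FALSE)
  identical(as.integer(status), 0L)      # TRUE only if the child exited cleanly
}
```

Calling loads_in_child("Rmpi") on a login node and again inside a batch
job would show whether the load failure is specific to the job
environment.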

You will probably need to know more about your cluster; I don't have
time to study up on that for you. The error messages look suspiciously
like ones from an MS Windows system, which would be surprising. On a
Linux cluster, I've never seen a dyn.load failure like the one you have.



On Sun, Apr 29, 2012 at 11:07 PM, Libo Sun <libosun at rams.colostate.edu> wrote:
> Hi,
>
> I am trying to test 'Rmpi' on Blacklight,
> http://www.psc.edu/machines/sgi/uv/blacklight.php, in PSC.
>
> The R code I am trying to run is very simple:
>
> mpi.remote.exec(paste("I am",mpi.comm.rank(),"of",mpi.comm.size()))
> mpi.remote.exec(paste("I am",Sys.info()[1],"of",mpi.comm.size()))
>
> The batch file:
>
> #!/bin/csh
> #PBS -l walltime=5:00
> #PBS -l ncpus=16
> #PBS -o test.out
> #PBS -j oe
> #PBS -q debug
>
> # define module command
> source /usr/share/modules/init/csh
>
> set echo
> date
>
> # load R module
> module load R/2.11.1
> module load Rmpi
> module swap mpt/2.04 mpt/2.01
>
> # move to scratch space to run job
> cd $SCRATCH
>
> # copy input file to scratch space
> cp $HOME/test.R .
> cp $HOME/.Rprofile .
>
> mpirun -np 16 R --no-save -q < test.R > test.Rout
>
> However, here are some error messages I got:
>
>  *** caught segfault ***
> address (nil), cause 'memory not mapped'
>
> Traceback:
>  1: dyn.load(file, DLLpath = DLLpath, ...)
>  2: library.dynam("Rmpi", pkg, lib)
>  3: f(libname, pkgname)
>  4: firstlib(which.lib.loc, package)
>  5: doTryCatch(return(expr), name, parentenv, handler)
>  6: tryCatchOne(expr, names, parentenv, handlers[[1L]])
>  7: tryCatchList(expr, classes, parentenv, handlers)
>  8: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if
> (!is.null(call)) {        if (identical(call[[1L]],
> quote(doTryCatch)))             call <- sys.call(-4L)        dcall <-
> deparse(call)[1L]        prefix <- paste("Error in", dcall, ": ")
> LONG <- 75L        msg <- conditionMessage(e)        sm <- strsplit(msg,
> "\n")[[1L]]        w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type
> = "w")        if (is.na(w))             w <- 14L + nchar(dcall, type = "b")
> + nchar(sm[1L],                 type = "b")        if (w >
> LONG)             prefix <- paste(prefix, "\n  ", sep = "")    }    else
> prefix <- "Error : "    msg <- paste(prefix, conditionMessage(e), "\n", sep
> = "")    .Internal(seterrmessage(msg[1L]))    if (!silent &&
> identical(getOption("show.error.messages"),         TRUE)) {
> cat(msg, file = stderr())        .Internal(printDeferredWarnings())    }
> invisible(structure(msg, class = "try-error"))})
>  9: try(firstlib(which.lib.loc, package))
> 10: library(Rmpi, logical.return = TRUE)
> aborting ...
>
> Any idea how to fix this?
>
> Many thanks,
> Libo



-- 
Paul E. Johnson
Professor, Political Science    Assoc. Director
1541 Lilac Lane, Room 504     Center for Research Methods
University of Kansas               University of Kansas
http://pj.freefaculty.org            http://quant.ku.edu


