[R-sig-hpc] simple question on R/Rmpi/snow/slurm configuration

Martin Morgan mtmorgan at fhcrc.org
Mon Jan 5 21:46:35 CET 2009


"Whit Armstrong" <armstrong.whit at gmail.com> writes:

> I'm attempting to get Dirk's example from the "intro to HPC with R"
> talk working (http://dirk.eddelbuettel.com/papers/bocDec2008introHPCwithR.pdf).
>
> I have slurm working correctly (all the trivial hostname examples
> complete successfully).
>
> I fire up an R session with the following command:
>
> salloc orterun -n 7 R --vanilla

I think you want to use salloc to allocate your whole universe, and then
start R as a single process in that universe:

salloc -n 7 orterun -np 1 R --vanilla

then, at the R prompt,

> library(Rmpi)
> mpi.universe.size()

will report 7.
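
If you want to script this, here is a minimal sketch of a wrapper that only
prints the launch line for a given slot count. build_launch_cmd is a
hypothetical helper name of mine, not part of SLURM or Open MPI, and the
flags are just the ones from the command above. I have not run it on a real
cluster.

```shell
#!/bin/sh
# Sketch only: build_launch_cmd is a hypothetical helper, not a SLURM or
# Open MPI command. It prints the launch line instead of executing it.
build_launch_cmd() {
    slots="$1"
    # salloc -n reserves $slots tasks (the MPI universe);
    # orterun -np 1 starts a single R process inside that allocation.
    printf 'salloc -n %s orterun -np 1 R --vanilla\n' "$slots"
}

# Print the command for a 7-slot universe.
build_launch_cmd 7
```

Printing rather than executing keeps the sketch harmless on a machine
without SLURM; on the cluster you can paste the printed line into the shell.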

Martin

> and then run
> suppressMessages(library(Rmpi))
>
> but my console never returns control.
>
> It's just frozen until I <control-c> out of it, at which point I get
> this message:
>> suppressMessages(library(Rmpi))
> [linuxsvr.kls.corp:05875] mca: base: component_find: unable to open
> osc pt2pt: file not found (ignored)
> orterun: killing job...
>
> orterun noticed that job rank 0 with PID 5875 on node node0 exited on
> signal 15 (Terminated).
> salloc: Relinquishing job allocation 70
> [warmstrong@linuxsvr ~]$
>
> meanwhile squeue shows:
>
> [warmstrong@linuxsvr ~]$ squeue
>   JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
>      71      prod  orterun warmstro   R       0:31      1 node0
> [warmstrong@linuxsvr ~]$
>
>
> Have I missed something crucial?  Should I only be running these
> examples in batch mode or with littler?
>
> Thanks in advance,
> Whit
>

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793


