[R] parallel r job on sun gridengine
mark garey
garey at biostat.ucsf.edu
Thu Mar 24 01:27:34 CET 2005
greetings all,
this may be the wrong forum for my problem - if so, please advise.
i am addressing this list because of an error i am getting from the snow library or rmpi (i think) after lam has booted the mpi nodes.
i have a script provided by a faculty member (i am not an R user, but have the task of making it run successfully as a batch job on the gridengine). it runs successfully as an interactive shell script and can be run interactively using qrsh on a sun gridengine, but it fails when submitted to the gridengine as a batch job. the lam/mpi nodes boot and shut down properly via a parallel environment defined in the gridengine.
where the job falls flat is when the snow RMPInode.sh script is called - or so it seems. the error generated is:
___
/usr/local/lib/R.framework/Versions/2.0.0/Resources/library/snow/RMPInode.sh: line 9: 13465 Trace/BPT trap (core dumped)
${RPROG:-R} --vanilla >${OUT:-/dev/null} 2>&1 <<EOF
library(Rmpi)
library(snow)
runMPIslave()
EOF
___
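the failing line sends the worker's output to ${OUT:-/dev/null}, so by default anything r prints before the trap is discarded. a minimal sketch of that redirection pattern, assuming (from the error text only) that RMPInode.sh honors the OUT and RPROG variables. fake_r, start_worker and the mktemp log location are hypothetical stand-ins so the sketch runs without r or lam:

```shell
#!/bin/sh
# hypothetical stand-in for R: ignores its arguments and echoes stdin,
# so the sketch can run on a machine without R or LAM/MPI
fake_r() { cat; }

# same shape as the failing line in RMPInode.sh: the program comes from
# RPROG (default R), output goes to OUT (default /dev/null)
start_worker() {
    ${RPROG:-R} --vanilla >${OUT:-/dev/null} 2>&1 <<EOF
library(Rmpi)
library(snow)
runMPIslave()
EOF
}

LOG=$(mktemp)    # hypothetical log location, not part of snow
RPROG=fake_r     # in a real job this would stay unset or point at R
OUT=$LOG         # capture output instead of discarding it
start_worker
cat "$LOG"
```

with the real r in place of fake_r, the log would only hold whatever r managed to print before the crash, which may be nothing.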
environment is darwin (panther 10.3.8), r version is 2.0.0, gridengine version is 5.3.
i get the feeling this is not an r problem, but if you have used r in batch mode in a parallel environment, maybe you could point me in the right direction. i also realize that many factors could contribute to this error, but being able to rule out r (or the snow library) would be helpful.
thanks in advance,
mark+ \ ucsf biostat
--
mark garey
ucsf department of epidemiology and biostatistics
500 parnassus ave, mu420w
san francisco, ca. 94143
415-502-8870