[R-sig-hpc] Rmpi hanging on slave nodes

Sklyar, Oleg (London) osklyar at maninvestments.com
Mon Feb 2 18:27:24 CET 2009


Dear Hao, dear list:

For some time now I have had strange issues with Rmpi: the head node
spawns jobs on the slave nodes, those run for a couple of seconds, then
hang and never return to the head node (see the 'top' output below).
The issue occurs *only* when I run a function like the one below
(simplified) as part of a package. If I run the same function from the
global environment, or if the domaccs function below broadcasts a
function defined in the global environment, then everything runs
through.

Out of the 10 custom packages that I have, the above problem occurs for
only about two of them when I import them into the calling package. All
the others have no adverse effect. I tried stripping functionality out
of the two packages in question to avoid name clashes etc., but was not
able to locate any particular function that would lead to clashes: at
some point, if the package is attached, Rmpi stops working. I understand
that this information is very vague, but I am hoping that somebody has
had a similar problem already and knows the answer. The issue is the
same with Rmpi 0.5-5 through to 0.5-7.

I would be grateful for any ideas.

Thanks,
Oleg

PS: I am sorry I cannot post the exact code as it is not open source,
but then it would be too large anyway.

sim = function(tsdata, ...) {
  domaccs = function(tsdata, ...) {
    mpi.spawn.Rslaves(nslaves=length(tsdata), needlog=FALSE)
    res = mpi.parLapply(tsdata, 
      function(data, ...) {
        require(Sim)
        singleSim(data, ...)
      }, ..., job.num=length(tsdata))
    mpi.close.Rslaves()
    res
  }
  domaccs(tsdata, ...)
}
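
The difference between the two cases (worker function defined in the
global environment versus created inside another function, as in sim()
above) can be illustrated without Rmpi. The following is only a minimal
sketch: the names worker_global, make_worker and worker_nested are
hypothetical and not part of any of the packages mentioned, and whether
this difference is related to the hang is an assumption, not something
established here. It shows that a function created inside another
function carries that function's call frame as its enclosing
environment, and that this frame is included when the function is
serialized.

```r
# Case 1: a worker whose enclosure is the global environment.
# The environment is pinned explicitly so the sketch behaves the same
# regardless of how the script is run.
worker_global <- function(data, ...) length(data)
environment(worker_global) <- globalenv()

# Case 2: a worker created inside another function, like the anonymous
# function passed to mpi.parLapply() inside domaccs(). Its enclosure is
# the call frame of make_worker(), which also holds tsdata.
make_worker <- function(tsdata, ...) {
  function(data, ...) length(data)
}
worker_nested <- make_worker(list(1:3, 1:5))

# Compare the enclosing environments of the two workers.
is_global        <- identical(environment(worker_global), globalenv())
is_nested_global <- identical(environment(worker_nested), globalenv())

# Objects visible in the nested worker's enclosing frame.
captured <- ls(environment(worker_nested))

# Serialized sizes in bytes: the global environment is written as a
# reference, whereas an ordinary call frame is written with its contents.
size_global <- length(serialize(worker_global, NULL))
size_nested <- length(serialize(worker_nested, NULL))

print(list(is_global = is_global,
           is_nested_global = is_nested_global,
           captured = captured,
           size_global = size_global,
           size_nested = size_nested))
```

Running the same checks on the function actually passed to
mpi.parLapply() (via environment() and ls()) would show what travels
with it to the slaves.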


The system is RHEL5 64-bit with LAM 7.1.4 on a 16-node cluster of Opterons.

> sessionInfo()
R version 2.9.0 Under development (unstable) (2008-09-30 r46585) 
x86_64-unknown-linux-gnu 

locale:
C

attached base packages:
[1] splines   stats     graphics  utils     datasets  grDevices methods
[8] base     

other attached packages:
 [1] Rmpi_0.5-7         Sim_0.2.41      Data_0.2.20        NagLib_0.1.7
 [5] Finance_0.1.77     DBConn_0.2.24   ROracle_0.5-9      RODBC_1.2-3
 [9] DBI_0.2-4          Calendar_0.2.88 Base_0.1.36    

loaded via a namespace (and not attached):
[1] tools_2.9.0

Dr Oleg Sklyar
Research Technologist
AHL / Man Investments Ltd
+44 (0)20 7144 3107
osklyar at maninvestments.com
