[ESS] R session starts sending commands to wrong place

Ross Boylan ross at biostat.ucsf.edu
Tue Jun 30 01:55:04 CEST 2009


Short version: I have an ess-remote session (under openMPI) and a
regular R session.  Sometimes, typically after a long computation
(hours) in the ess-remote session, commands entered in the ess-remote
session get echoed to and executed by the regular R session.  The
ess-remote session appears hung and unresponsive during this time.

I worked around this by quitting R in the *R* window, responding yes to
several complaints about the R process disappearing, and then typing the
commands in the ess-remote buffer.  This time, they stayed in the
buffer.

Aside from not starting multiple R sessions, is there a good way to
avoid this problem?  Is it a bug?

DETAILS

11433 pts/4    S<s    0:00 /bin/sh
11530 pts/4    S<+    0:54  \_ emacs
14250 pts/0    S<s    0:00      \_ /bin/sh -i
30209 pts/0    S<+    0:00      |   \_ mpirun -np 32 --hostfile hosts RMPIInteractive
30014 pts/6    S<s+   0:09      \_ /usr/lib64/R/bin/exec/R --no-readline

30214 ?        S<s    0:00 orted --bootproxy 1 --name 0.0.1 --num_procs 5 --vpid_start 0 --nodename n7 --universe ross@n7:default-universe-3
30215 ?        S<     0:00  \_ /bin/bash /home/ross/clean/OLTData/RMPIInteractive
30219 ?        S<   141:09  |   \_ /usr/lib64/R/bin/exec/R --no-save

30219 is the only R process not at 100% CPU (running at 100% is
characteristic of openmpi slaves).  So presumably that is where the
commands should be going.  strace showed only
30219 15:49:37 read(0,  <unfinished ...>
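
A possible further check, sketched here on the assumption of a Linux
/proc filesystem: see what standard input (fd 0) of a given process is
actually connected to. The helper name stdin_target is made up for
illustration; for the case above the argument would be 30219.

```shell
# Print what a process's standard input (fd 0) is connected to.
# Linux-only: relies on /proc/<pid>/fd/0 being a symlink to the
# underlying terminal, pipe, or file.
stdin_target() {
    readlink "/proc/$1/fd/0"
}

# Example: inspect the current shell's own stdin.
stdin_target "$$"
```

A terminal shows up as something like /dev/pts/N and a pipe as
pipe:[inode], which would indicate whether that R is reading from the
terminal directly or through something openmpi set up.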

I wonder if the cause is some interaction between ESS and openmpi, which
does some input and output redirecting to wire one of the spawned
processes (almost certainly 30219) to my "terminal".  My understanding
is that ess-remote is simply sending commands to that terminal, and
openmpi is taking care of getting them to the master R.

In even more detail:
1. start emacs
2. open shell within emacs
3. execute the mpirun command within the shell.
4. The invoked script does
     R --no-save $*
   for rank 0 and 
     R --no-save $* > rmpi.$RANK 2>&1
   for others.
5. In emacs, invoke ess-remote with language r.
6. In my terminal
	options(error=recover)
   in an effort to avoid death at the first error.
   It does that, but R's machinery still seems to think it's
   non-interactive.
7.  do long computation
8.  type a command, most commonly save.image()
9. hit enter.
10. cursor sits blinking on the "(" in save.image()
11. terminal non-responsive to inputs, mostly (in particular, hitting
enter has no effect)
12.  hitting ctl-g seems to cause previous enters to print out, and
better response to keys (at least I can switch to another session).
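
For reference, the logic of the invoked script in step 4 can be sketched
as below. This is a reconstruction, not the actual RMPIInteractive: how
RANK gets set is not shown above and depends on the openmpi version, and
R_CMD is a hypothetical override added only so the branching can be
tried without starting R. It also uses "$@" where the original uses $*,
so that arguments containing spaces survive.

```shell
#!/bin/sh
# Sketch of a wrapper in the style of RMPIInteractive (reconstruction).
# Assumption: RANK holds this process's MPI rank; an unset RANK is
# treated as rank 0 here.
# R_CMD is a hypothetical override for trying the logic without R.
run_r() {
    if [ "${RANK:-0}" -eq 0 ]; then
        # Rank 0 keeps its stdin/stdout, so it is the process that
        # ends up wired to the terminal.
        ${R_CMD:-R} --no-save "$@"
    else
        # Every other rank writes stdout and stderr to a per-rank file.
        ${R_CMD:-R} --no-save "$@" > "rmpi.$RANK" 2>&1
    fi
}

# In a real script the last line would be:  run_r "$@"
```

With this layout only rank 0 has its input and output on the terminal,
which is consistent with 30219 being the process blocked in read(0, ...).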

Variations:
Step 8 is sometimes preceded by some commands executing successfully.
It has hung up on commands involving no obvious disk I/O, e.g., print.
This, coupled with a sysadmin check that the disk is OK, suggests it's
not a disk problem.

The workspace (.RData) is around 13MB.  Its timestamp seems to match
when I issued the save.image() command, but that file was saved by the
regular R process, which runs in the same directory.

I am not sure about the relative timing of launching the ess-remote
process and the regular ess process.

Debian Lenny
ess                           5.3.8~svn3917-1
emacs                         22.2+2-5
openmpi-bin                   1.2.7~rc2-2
r-base-core                   2.7.1-1+lenny1