[Rd] Is there a way to disable / warn about forking?
thomas.friedrichsmeier at ruhr-uni-bochum.de
Tue Oct 4 20:05:16 CEST 2011
On Tuesday 04 October 2011, Simon Urbanek wrote:
> I don't see why this should be anything new - this is already happening
> since both packages that were folded into parallel (snow and multicore)
> are well known and well used.
> In multicore we were explicitly warning about this and also working around
> issues where possible (e.g. the Mac GUI). Judging by the
> widespread use of multicore and the absence of problem reports related to
> GUIs, my impression would be that this aspect is not really a problem
> (more below). We get more users confused about the inability to perform
> side-effects than this, for example.
Well, some users do heed the advice to address their problem reports to the
package / GUI maintainers, especially if they find that the problem only
occurs with the GUI loaded, not in a "plain" R session.
We've had a problem report about using mclapply() in the RKWard bug tracker
for a while already.
> In general, there are two main issues that can be addressed by the GUI:
> a) shared file descriptors. This is a problem if the GUI uses FDs for
> communication and they are not closed in the child instance. You don't
> want both the child and the parent to process those FDs. E.g., closeAll()
> can be used to work around that issue and with parallel there could be an
> easier interface for this given that it's in core R.
> b) event loop. If the GUI hooks into the event loop then, obviously, this
> is only intended to be run from the master. multicore was already
> disabling the event loop hook for AQUA, but it was hard to provide a more
> comprehensive solution since it needed cooperation of R. In parallel it's
> much easier, because it can modify R to allow the event loop conditionally
> and thus only in the master process.
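To make point (a) concrete, here is a minimal sketch in Python rather than R. It is not how multicore or parallel implement this; the pipe standing in for a GUI connection and the helper name fork_with_closed_fds are assumptions made up for illustration. The child closes its inherited copies of the "GUI" descriptors before doing any work, and the parent's copies remain usable afterwards.

```python
import os

def fork_with_closed_fds(gui_fds):
    """Fork; the child closes its inherited GUI descriptors before working."""
    result_r, result_w = os.pipe()       # private channel for the child's result
    pid = os.fork()
    if pid == 0:                         # child process
        status = 0
        try:
            os.close(result_r)
            for fd in gui_fds:
                os.close(fd)             # the child must not touch the GUI channel
            os.write(result_w, b"done")
        except BaseException:
            status = 1
        finally:
            os._exit(status)             # leave without running the parent's cleanup code
    os.close(result_w)                   # parent keeps only the read end
    data = os.read(result_r, 16)
    os.close(result_r)
    os.waitpid(pid, 0)
    return data

gui_r, gui_w = os.pipe()                 # hypothetical stand-in for a frontend connection
child_result = fork_with_closed_fds([gui_r, gui_w])
os.write(gui_w, b"ping")                 # the parent's descriptors are still open
parent_echo = os.read(gui_r, 4)
```

Closing a descriptor in the child only drops the child's reference to it, which is why the parent can still write to and read from the pipe after the child has exited.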
For me, the problem set was: multiple threads plus mutexes, linking to a
library that installs a SIGCHLD handler, and code waiting for the
"communicator" thread to negotiate something with the frontend, except that
this thread doesn't exist in the fork()ed child process...
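The missing-thread effect can be reproduced outside of R. The following is a small Python sketch, not RKWard code; the "communicator" thread here is just a hypothetical idle thread. After fork(), only the thread that called fork() exists in the child, so anything in the child that waits for the other thread would wait forever.

```python
import os
import threading
import warnings

def count_threads_in_child():
    """Start a helper thread, fork, and report what the child can see of it."""
    stop = threading.Event()
    communicator = threading.Thread(target=stop.wait, daemon=True)
    communicator.start()                 # hypothetical "communicator" thread
    r, w = os.pipe()
    with warnings.catch_warnings():
        # Python 3.12+ emits a DeprecationWarning when forking with threads running.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:                         # child: only the forking thread was copied
        status = 0
        try:
            os.close(r)
            report = b"%d %d" % (threading.active_count(),
                                 communicator.is_alive())
            os.write(w, report)
        except BaseException:
            status = 1
        finally:
            os._exit(status)
    os.close(w)
    data = os.read(r, 32)                # "<thread count> <communicator alive?>"
    os.close(r)
    os.waitpid(pid, 0)
    alive_in_parent = communicator.is_alive()
    stop.set()                           # let the helper thread finish
    communicator.join()
    return data, alive_in_parent

child_report, parent_alive = count_threads_in_child()
```

In the parent the helper thread is still running after the fork, while the child reports a single thread and sees the helper as not alive.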
After spending the day debugging, I think I have finally solved the key issues
for RKWard. That also means the issue is mostly painless for me now. However,
addressing fork()-related issues is not always a trivial exercise, and I
continue to think that it could be useful for maintainers of "problematic"
packages to have a way to stop / warn direct and indirect users running
> The whole point of parallel is that it can do more than an external
> package, so I think you're going about it the wrong way - you should be
> talking to us much earlier so whatever your constraints in RKWard can be
> possibly addressed by the infrastructure. Also note that a lot of this
> should be seamless, a lot of users don't care what the infrastructure is,
> they just want their task to run in parallel, they don't care about
> mcfork() and the like - the choices will be made for them, because there
> is no fork on Windows, for example.
Exactly. I want the choice to be made for the user, where reasonably possible.
My point is that knowing whether you're on Windows or a Unix is not enough to
decide on the technique to use, in this case. Reliably enumerating all corner
cases where forking could be a problem on Unix is probably next to impossible.
The developers responsible for those corner cases have a decent chance to be
aware of the problem, though. Thus, I think it would be a good idea if
they had a standard way of informing library(parallel), and any third party
using library(parallel), if there is a problem with forking.
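As a rough illustration of what such a mechanism might look like, here is a sketch in Python. Nothing in it is an existing interface of parallel or of any GUI; the class ForkSafetyRegistry, its methods, and the backend labels "fork" and "socket" are all hypothetical. A package that knows forking is a problem registers an objection, and the code that picks the parallel technique consults the registry instead of looking only at the platform.

```python
class ForkSafetyRegistry:
    """Hypothetical registry where packages can flag that fork() is unsafe."""

    def __init__(self):
        self._objections = {}            # package name -> reason

    def declare_fork_unsafe(self, package, reason):
        """Called by a package or frontend that cannot cope with forking."""
        self._objections[package] = reason

    def choose_backend(self, has_fork=True):
        """Return (backend, reasons); fall back to a non-forking backend if needed."""
        if not has_fork:
            return "socket", ["fork() is not available on this platform"]
        if self._objections:
            reasons = [f"{pkg}: {why}"
                       for pkg, why in sorted(self._objections.items())]
            return "socket", reasons
        return "fork", []

registry = ForkSafetyRegistry()
before = registry.choose_backend()       # nobody has objected yet
registry.declare_fork_unsafe(
    "rkward", "frontend communication thread is missing in forked children")
after = registry.choose_backend()        # now the non-forking backend is chosen
```

The returned reasons could be used either to switch techniques silently or to issue a warning, depending on how strict the caller wants to be.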