[Rd] [R] multithreading calling from the rpy Python package

René J.V. Bertin rjvbertin at gmail.com
Thu Oct 12 19:12:43 CEST 2006


Thanks, Duncan,

> It is a mixture of two things. Yes, R is not thread safe so if
> two system threads were to access R concurrently, bad things would
> happen a.s.

  That's clear, yes. :-/ And a pity, but so be it.

> It is also an issue when Python is compiled and linked with
> threaded options and routines from  the system, e.g. libpthread
> and R is not.  When R is dynamically loaded into the Python
> process, unless R is very carefully compiled, symbols (i.e. routines)

I built Python with --enable-threads, but I don't think R has a build
option for this?

> that R uses will come from the Python executable and these may not
> agree with R's view at compilation. And bad things happen.

  But that would also happen in single-threaded applications, and it
doesn't. Unless I'm misunderstanding you...

> This depends on your operating system, and it doesn't appear that
> you have told us what that is. Bad boy :-)

  Indeed it depends on the OS. Read again. It says (somewhere...) that
I'm using Mac OS X 10.4.8 :P . And under that OS, symbols are not
visible by default across shared libraries.

> This is an issue with Rpy, RSPython, RSPerl, R apache module, rJava, .......

Rpy only allows the creation of a single R "instance". Even if it were
possible, it probably wouldn't help to create as many instances as
there are threads, right? The "memory not mapped" error message
suggests that one thread tries to access memory that was just freed by
another thread. A bit surprising, maybe, that this happens in a function
that appears to be intended to be recursive (judging from the
traceback). As far as I understand, thread-safe means re-entrant, which
means recursive-safe too...
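
A minimal sketch of the usual workaround when several Python threads
share one non-thread-safe interpreter: funnel every call through a single
lock. The names call_r, run_threads and _state are hypothetical stand-ins
invented for illustration; they are not part of rpy, and nothing here
claims rpy does this internally. A re-entrant lock (threading.RLock) is
used so that a recursive call in the thread that already holds the lock
does not deadlock, while other threads have to wait.

```python
import threading

# Hypothetical stand-in for the embedded interpreter's shared state.
# This is NOT rpy code; it only simulates state that must not be
# touched by two threads at once.
_r_lock = threading.RLock()  # re-entrant: the owning thread may nest calls
_state = {"depth": 0, "max_depth": 0, "calls": 0}


def call_r(n):
    """Pretend interpreter call that recurses n more times under the lock."""
    with _r_lock:
        _state["depth"] += 1
        _state["max_depth"] = max(_state["max_depth"], _state["depth"])
        _state["calls"] += 1
        if n > 0:
            call_r(n - 1)  # same thread re-acquires the RLock without blocking
        _state["depth"] -= 1


def run_threads(n_threads=4, depth=3):
    """Start n_threads workers that each make one recursive call_r(depth)."""
    _state.update(depth=0, max_depth=0, calls=0)  # reset the simulated state
    workers = [threading.Thread(target=call_r, args=(depth,))
               for _ in range(n_threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return dict(_state)
```

Because the outermost call keeps the lock for its whole recursion, the
recorded nesting depth never exceeds that of a single thread. The price
is that the threads run the guarded section one after another, so this
buys safety, not parallel speed.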

...

> e.g. make it extensible at the native level.  For stat. computing
> to continue to grow and for all of us to be able to explore newer
> areas, we probably need to think about building infrastructure for the
> next 5- 10 years and not continue to tweak a model that has been around
> for 30 years.  How we do this requires some serious thought

  I can't agree more, but have no suggestions....

> and evaluating trade-offs of building things ourselves with a small
> community or leveraging other existing or emerging systems, e.g. Python,
> Perl6/Parrot, etc.

  Well, Python is great, numpy and scipy allow one to do serious work,
but there are things in which R has a clear advantage. Just to name
some: handling of missing values is one (and the reason I'm not using
numpy or scipy's var function). Slicing is another (somewhat
cumbersome in Python), data.frames yet another. I'm not sure how easy
it would be to extend Python's syntax to accommodate something as
useful as

a[ is.na(a) ] <- -1
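
For comparison, a rough sketch of the same operations in plain Python
using only the standard library. It assumes that NaN stands in for R's NA,
which is only an approximation; replace_missing and var_skip_missing are
hypothetical helper names made up for this illustration, not functions
from numpy or scipy.

```python
import math

NA = float("nan")  # assumption: NaN plays the role of R's NA


def replace_missing(values, fill):
    """Return a copy of values with every NaN entry replaced by fill."""
    return [fill if math.isnan(v) else v for v in values]


def var_skip_missing(values):
    """Sample variance (n - 1 denominator) over the non-missing entries."""
    kept = [v for v in values if not math.isnan(v)]
    n = len(kept)
    if n < 2:
        return NA  # variance is undefined for fewer than two values
    mean = sum(kept) / n
    return sum((v - mean) ** 2 for v in kept) / (n - 1)


a = [1.0, NA, 3.0, NA, 5.0]
v = var_skip_missing(a)     # roughly var(a, na.rm=TRUE) in R
a = replace_missing(a, -1)  # roughly a[ is.na(a) ] <- -1 in R
```

The replacement builds a new list rather than assigning through a
logical index, which is the kind of extra ceremony the R one-liner avoids.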

R.B.



