[Rd] Running two R instances at the same time
Peter Juhasz
peter.juhasz83 at gmail.com
Sat Sep 5 20:31:18 CEST 2009
Reposting from R-help:
Dear R experts,
please excuse me for writing to the mailing list without subscribing.
I have a somewhat urgent problem that relates to R.
I have to process large amounts of data with R - I'm in an
international collaboration and the data processing protocol is fixed,
that is, a specific set of R commands has to be used.
I wrote a Perl program that manages creation of data subsets from my
database and feeds these subsets to an R process via pipes.
This worked all right; however, I wanted to speed things up by
exploiting the fact that I have a dual-core machine. So I rewrote my
Perl driver program to use two threads, each starting its own R
instance, getting data off a queue and feeding it to its R process.
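For illustration, the two-thread driver described above can be sketched roughly as follows. This is a Python sketch, not the original Perl program (excerpts of that are at the PerlMonks link given further down). The names `feed_worker` and `run_driver` are hypothetical, and the child command is a parameter so that a harmless stand-in process can take the place of R.

```python
import queue
import subprocess
import sys
import tempfile
import threading


def feed_worker(cmd, jobs, outputs, index):
    """Run one child process and pipe it every subset this thread pulls off the queue."""
    with tempfile.TemporaryFile(mode="w+") as out:
        # The child writes to a file, so a full stdout pipe can never
        # block the stdin writes below.
        proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=out, text=True)
        while True:
            try:
                subset = jobs.get_nowait()
            except queue.Empty:
                break
            proc.stdin.write(subset)
        proc.stdin.close()  # end of input lets the child finish
        proc.wait()
        out.seek(0)
        outputs[index] = out.read()


def run_driver(cmd, subsets, n_workers=2):
    """Share the subsets between n_workers threads, each owning its own child process."""
    jobs = queue.Queue()
    for subset in subsets:
        jobs.put(subset)
    outputs = [""] * n_workers
    threads = [
        threading.Thread(target=feed_worker, args=(cmd, jobs, outputs, i))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outputs


if __name__ == "__main__":
    # Stand-in child that upper-cases its input (an assumption, used instead of R).
    child = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(run_driver(child, ["first subset\n", "second subset\n", "third subset\n"]))
```

With R installed, the command might be something like `["R", "--vanilla", "--quiet"]`; that invocation is an assumption here, since the post does not show how R was started.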
This also worked, except that I noticed something very peculiar: the
total processing time was almost exactly the same with two R instances
as with one. I did some tests to look at this, and it seems that R needs
twice the time to do the exact same thing if there are two instances of
it running.
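A timing test of the kind mentioned above could look roughly like this. It is a minimal Python sketch, not the test that was actually run: `wall_time` is a hypothetical helper, and the CPU-bound stand-in job is an assumption that would be replaced by the real R invocation.

```python
import subprocess
import sys
import time


def wall_time(cmd, copies):
    """Start `copies` identical child processes at once; return seconds until all have exited."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd, stdout=subprocess.DEVNULL) for _ in range(copies)]
    for p in procs:
        p.wait()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in CPU-bound job (an assumption; replace with the real R command).
    busy = [sys.executable, "-c", "sum(i * i for i in range(5_000_000))"]
    one = wall_time(busy, 1)
    two = wall_time(busy, 2)
    print(f"1 copy: {one:.2f}s, 2 copies: {two:.2f}s, ratio: {two / one:.2f}")
```

A ratio near 1 means the two copies overlapped in time; a ratio near 2 means they effectively ran one after the other, whatever the cause.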
I don't understand how this is possible. Maybe there is an issue of
thread-safety with the R backend, meaning that the two R *interpreter*
instances are talking to the same backend that's capable of processing
only one thing at a time?
Technical details: the OS was Ubuntu 9.04 running on a Core 2 Duo E7300, and
the R version used was the default one from the Ubuntu repository.
Please see http://www.perlmonks.org/?node_id=792460 for an extended
discussion of the problem, and especially
http://www.perlmonks.org/?node_id=793506 for excerpts of output and
actual code.
I have received several suggestions about R packages that would enable
parallel processing in some way or other, and I'm thankful for those.
However, at this point I'm interested in having two completely
unrelated R processes that run simultaneously, not in parallel
processing from within R.
I have to admit that I'm an absolute beginner when it comes to R and
this project will be finished before I could learn everything I'd need
for a pure R solution. I'm familiar with Perl, however, so I'd like to
stick to that.
Thanks in advance for your answers, and please excuse me if this causes
too much noise.
Péter Juhász
physicist