[R] Re: R-1.1.0 is released : GUI

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Sat Jun 17 16:11:05 CEST 2000

"Yves Gauvreau" <cyg at sympatico.ca> writes:

> How much time and effort would
> be required to improve R capabilities by a factor of 2 say? I think it would
> be lots and lots. How much work would it be to give R parallelism I think it
> could be a lot less. All together I think asking the core team to improve
> the performance of R, knowing that probably only very few would really
> benefit, is asking too much. I would be curious to know what it would involve
> to provide R with parallelism by enlisting PVM or another similar package?

There are many levels of parallelism, not all of which are equally
easy to exploit. At the coarsest level, each machine does essentially
its own thing in a separate process. This is very easy to implement
and quite useful for e.g. simulations. For a somewhat more elaborate
version of this there is the Condor project at Wisconsin, which allows
processes to be moved between different CPUs - R has been tested
on this, but I don't know how far they've got.

At the finest level there's a potential for very highly optimized
algorithms for specific purposes like matrix inversion, sorting, and
differential equation solving. Even though such algorithms have been
known to be theoretically possible for ages, actual implementations
seem to be slow in coming. This probably has to do with hardware
dependencies and the transience of supercomputer architectures -- who
wants to code for a hypercube machine if it is going out of production
in a year? Also, massively parallel machines have been slow in
approaching affordable levels. Consequently, those experimental
architectures seem to be on the way out, with simpler systems coming
in. In a way this is a natural choice, since it comes down to running
two jobs one after the other at double speed versus running them
concurrently. (I'm a bit puzzled why no one has come up with specific
hardware cards for e.g. O(n)-complexity sorting, though.) R probably
has nothing to contribute in this area, but we can try to utilize
optimized libraries as they become available.

At an intermediate level, many things in R are parallelizable -- e.g.
most of the apply()-style functions -- and scheduling these on
separate threads on a multiprocessor machine could give potentially
large speedups. The possibility of running separate threads within R
has been on the wishlist for other reasons as well (particularly for
keeping a GUI running during heavy computations). The synchronisation
issues and the corresponding language elements are not entirely
trivial, though.

   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
