[Rd] Distributed computing

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Mar 24 09:03:07 CET 2004


Fei Chen implemented distribution of data and ScaLAPACK as part of his 
DPhil thesis, with a high-level R interface.  Moving data around is often 
the major limiting factor on large-scale model fitting (he was 
experimenting with glm's).

There are two brief papers at

http://www.isi-2003.de/guest/3427.pdf?MItabObj=pcoabstract&MIcolObj=uploadpaper&MInamObj=id&MIvalObj=3427&MItypeObj=application/pdf

and in the DSC2003 proceedings (but the ci.tuwien server is currently not 
available, at least from here).

Now that Fei's DPhil process is complete, perhaps he will make the thesis 
available online.


On Tue, 23 Mar 2004 gte810u at mail.gatech.edu wrote:

Quoting someone unnamed! --

> > My inclination would be to, whenever possible, replace the core scalar
> > libraries with compatible parallel versions (lapack -> scalapack),
> > rather than make it an add-on package. If the R client code is general
> > enough, and the make file can automatically find the parallel version,
> > then it's a simple matter of compiling with the parallel libs. (Don't
> > know if this is possible at run-time.) No rewriting (high level) R code
> > at all. I tried to contact the plapack folks here at UT about
> > integrating with R, but it appears the project is no longer active.
> 
> Unfortunately, there is a major complication to this approach:  the distribution
> of data.  ScaLAPACK (and PLAPACK) requires the data to be distributed in a
> special way before calculation functions can be called.  Given a generic R
> matrix, we have to distribute the data before we can call ScaLAPACK functions on
> it.  We then have to collect the answer before we can return it to R.  Because
> of this serious overhead, replacing all LAPACK calls with ScaLAPACK calls would
> not be recommended.  Future versions of our package [1] may include some type of
> automatic benchmarking to decide when problems are large enough to be worth
> sending to ScaLAPACK.
> 
> 
> David Bauer
> 
> [1] http://www.aspect-sdm.org/Parallel-R/
> 
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
> 
> 
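
To make the distribution step David describes concrete, here is a rough
sketch of the two-dimensional block-cyclic index mapping that ScaLAPACK's
dense routines expect.  It is only an illustration of the bookkeeping, not
code from either package: the function names (owner, distribute, collect)
are made up for this note, the matrix is a plain list of rows, and the
"processes" are just dictionary keys.  A real implementation would also
build array descriptors and ship each piece over the network, which is
where the cost comes from.

```python
def owner(i, j, mb, nb, nprow, npcol):
    """Grid coordinates of the process that owns global entry (i, j).

    Indices are 0-based; mb x nb is the block size and nprow x npcol
    is the shape of the process grid (all hypothetical parameter names).
    """
    return ((i // mb) % nprow, (j // nb) % npcol)


def distribute(a, mb, nb, nprow, npcol):
    """Scatter a list-of-rows matrix into one piece per grid process.

    Each piece maps a global (i, j) index to its value; this stands in
    for the local arrays a real distributed code would allocate.
    """
    pieces = {(p, q): {} for p in range(nprow) for q in range(npcol)}
    for i, row in enumerate(a):
        for j, value in enumerate(row):
            pieces[owner(i, j, mb, nb, nprow, npcol)][(i, j)] = value
    return pieces


def collect(pieces, nrow, ncol):
    """Gather the pieces back into an ordinary list-of-rows matrix."""
    a = [[None] * ncol for _ in range(nrow)]
    for local in pieces.values():
        for (i, j), value in local.items():
            a[i][j] = value
    return a


# A 5 x 5 matrix in 2 x 2 blocks on a 2 x 2 process grid.
a = [[10 * i + j for j in range(5)] for i in range(5)]
pieces = distribute(a, mb=2, nb=2, nprow=2, npcol=2)
sizes = {proc: len(local) for proc, local in pieces.items()}
restored = collect(pieces, 5, 5)
```

Every entry is touched once on the way out and once on the way back, so a
round trip moves on the order of 2 * n^2 values for an n x n matrix.  That
is the overhead that has to be repaid by the parallel computation before
handing a problem to ScaLAPACK is worthwhile.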

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


