[R] R on dual-core machines
Prof Brian Ripley
ripley at stats.ox.ac.uk
Mon Jan 30 15:37:35 CET 2006
On Mon, 30 Jan 2006, Ales Ziberna wrote:
> Dear expeRts!
> I'm thinking of buying a new computer and am considering dual-core
> processors, such as AMD Athlon64 X2. Since I'm not a computer expert, please
> forgive me if some of my questions are silly.
> First, am I correct that using a dual-core processor is (from R's point of
> view) the same as using a computer with two processors?
Depends on the OS (R does not get to see at that level), but that's a fair
presumption. For example, our dual dual-core Opteron box is reported as
having four processors by Linux.
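
A minimal sketch of checking, from within R, how many processors the OS
reports. It assumes the parallel package, which ships with current versions of
R but postdates this thread; the variable names are only illustrative.

```r
library(parallel)

# Number of logical processors the OS reports to R
# (NA if it cannot be determined on this platform).
n_logical <- detectCores()

# Number of physical cores where the platform can tell the difference;
# on some systems this is NA or simply equals the logical count.
n_physical <- detectCores(logical = FALSE)

cat("logical:", n_logical, " physical:", n_physical, "\n")
```

On a box like the one described above, the logical count would be expected to
be four, but the value depends entirely on what the OS exposes.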
> If that is true, the posts I found on the list imply that using such a
> processor can usually bring significant improvements (in computational time)
> only in the case where the code (C or sometimes R) is specially designed for
> multiple processors (see comments below).
> So based on these and other comments I can conclude that if I'm not prepared
> (able) to make such modifications, I can expect improvements only in these
> two areas:
> 1. If I am running two instances of R.
> 2. If I'm running several other programs on the computer besides R, the
> programs and R would run faster, since they would not "compete" for
> processor time (so much).
Yes, but running multiple R instances can be very useful.
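
As a hedged illustration of running two R instances side by side, the sketch
below starts two independent Rscript processes from a parent session and
collects their results from files. The worker script, the file names and the
toy simulation are all hypothetical; only base R functions are used.

```r
# Hypothetical worker script: takes a seed and an output path, runs a toy
# simulation, and writes the result atomically (write to .tmp, then rename).
script <- tempfile(fileext = ".R")
writeLines(c(
  "args <- commandArgs(trailingOnly = TRUE)",
  "set.seed(as.integer(args[1]))",
  "res <- mean(rnorm(1e6))",
  "tmp <- paste0(args[2], '.tmp')",
  "saveRDS(res, tmp)",
  "invisible(file.rename(tmp, args[2]))"
), script)

rscript   <- file.path(R.home("bin"), "Rscript")
out_files <- tempfile(pattern = c("sim1_", "sim2_"), fileext = ".rds")

# wait = FALSE returns immediately, so both processes run at the same time
# and the OS is free to schedule them on different cores.
for (i in seq_along(out_files)) {
  system2(rscript, c(shQuote(script), i, shQuote(out_files[i])), wait = FALSE)
}

# Poll until both result files exist (give up after two minutes).
deadline <- Sys.time() + 120
while (!all(file.exists(out_files)) && Sys.time() < deadline) Sys.sleep(0.2)

results <- unname(vapply(out_files, readRDS, numeric(1)))
print(results)
```

Each instance is a full R process with its own memory, which is why the RAM
and swap advice that follows matters.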
Our long-term experience with multiple-processor machines is that you do
need to ensure you have adequate RAM and plenty of swap space, especially
on OSes that do not handle out-of-swap gracefully.
> Thanks in advance for any useful suggestions,
> Ales Ziberna
> P.S.: Useful posts on the list follow:
> It depends on the usage pattern. If you run multiple CPU-bound processes in
> parallel without too much coordination (parallel make is a good example,
> simulations another), then you get close to double up from a dual. For a
> single R process, you can get something like 40% improvement in large linear
> algebra problems, using a threaded ATLAS.
> For other problems the speedup is basically nil. There is some potential in
> threading R or (much easier) some of its vector operations, but that is not
> even on the drawing board at this stage.
> If you want to exploit multiple processors, you can write code (e.g., in C)
> called from R (e.g., through .Call or .C) that performs parallel/threaded
> computations in a thread-safe way (e.g., without calling back into R).
> Another possibility is to replace the BLAS/LAPACK library with a thread-safe
> version. This provides a boost to those R algorithms exploiting these
Replace `thread-safe' by `multi-threaded', as you do need the BLAS to
use multiple threads itself (not be told to). See the R-admin manual for
how to do this with Linux versions of OS.
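
A rough way to see whether the BLAS in use is multi-threaded is to time a
large matrix product and compare user with elapsed time. This is only a
sketch: the matrix size is arbitrary, and extSoftVersion() is assumed to be
available (it exists in recent versions of R, not the one current when this
was written).

```r
# Which BLAS is this R linked against? (May be an empty string if unknown.)
print(extSoftVersion()["BLAS"])

set.seed(1)
n <- 1000
X <- matrix(rnorm(n * n), n, n)

# crossprod(X) computes t(X) %*% X through the BLAS. With a multi-threaded
# BLAS, 'elapsed' can be well below 'user' time; with a single-threaded
# BLAS the two are roughly equal.
timing <- system.time(XtX <- crossprod(X))
print(timing)
```

R code outside such linear algebra calls is unaffected by the choice of BLAS.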
> An alternative is to do all of the parallelization within R using nice tools
> like the snow package combined with Rmpi. If your task is computationally
> intensive on the R side, but not on the client, then parallelizing R code
> may be the better way to go. All depends on your application, I think.
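
A minimal sketch of that style of parallelization within R. The post above
refers to snow combined with Rmpi; this sketch instead assumes the parallel
package bundled with current R, which offers a snow-style socket cluster and
needs no MPI installation. The function one_rep and the toy simulation are
hypothetical.

```r
library(parallel)

# One simulation replicate: the mean of n standard normal draws.
# The index i is unused; it only identifies the replicate.
one_rep <- function(i, n = 1e5) mean(rnorm(n))

cl <- makeCluster(2)                 # two worker R processes on this machine
clusterSetRNGStream(cl, iseed = 42)  # independent, reproducible RNG streams
res <- parLapply(cl, 1:8, one_rep)   # replicates are split across the workers
stopCluster(cl)

means <- unlist(res)
print(means)
```

The workers are separate R processes, so this is close in spirit to running
two R instances by hand, with the coordination done from the master session.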
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595