[R] R on dual-core machines

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jan 30 15:37:35 CET 2006


On Mon, 30 Jan 2006, Aleš Žiberna wrote:

> Dear expeRts!
>
> I'm thinking of buying a new computer and am considering dual-core
> processors, such as the AMD Athlon64 X2. Since I'm not a computer expert, please
> forgive me if some of my questions are silly.
>
> First, am I correct that using a dual-core processor is (from R's point of
> view) the same as using a computer with two processors?

Depends on the OS (R does not get to see down to that level), but that's a fair 
presumption.  For example, our dual dual-core Opteron box is reported as 
having four processors by Linux.

> If that is true, the posts I found on the list imply that using such a
> processor can usually bring significant improvements (in computational time)
> only in the case where the code (C or sometimes R) is specially designed for
> multiple processors (see comments below).
>
> So based on these and other comments I conclude that if I'm not prepared
> (or able) to make such modifications, I can expect improvements only in these
> two areas:
> 1.	If I am running two instances of R.
> 2.	If I'm running several other programs on the computer besides R, those
> programs and R would run faster, since they would not "compete" for
> processor time (as much).

Yes, but running multiple R instances can be very useful.

Our long-term experience with multiple-processor machines is that you do 
need to ensure you have adequate RAM and plenty of swap space, especially 
on OSes that do not handle out-of-swap gracefully.
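
For illustration, a minimal sketch of the `two instances' approach, 
assuming an embarrassingly parallel job split across two placeholder 
scripts (the script names here are hypothetical); the OS then schedules 
each R process on its own core:

  ## Launch two independent batch runs from an existing R session; each
  ## script computes half the work and save()s its results for later
  ## collection in a final session.
  system("R CMD BATCH --no-save sim_part1.R sim_part1.Rout", wait = FALSE)
  system("R CMD BATCH --no-save sim_part2.R sim_part2.Rout", wait = FALSE)

The same can of course be done directly from a shell prompt.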

>
> Thanks in advance for any useful suggestions,
> Ales Ziberna
>
> P.S.: Useful posts on the list follow:
>
>
>
> It depends on the usage pattern. If you run multiple CPU-bound processes in
> parallel without too much coordination (parallel make is a good example,
> simulations another), then you get close to a twofold speed-up from a dual. For a
> single R process, you can get something like a 40% improvement in large linear
> algebra problems, using a threaded ATLAS.
> For other problems the speedup is basically nil. There is some potential in
> threading R or (much easier) some of its vector operations, but that is not
> even on the drawing board at this stage.
>
> ------------------------------------------------------------------------
>
> If you want to exploit multiple processors, you can write code (e.g., in C)
> called from R (e.g., through .Call or .C) that performs parallel/threaded
> computations in a thread-safe way (e.g., without calling back into R).
> ---
> Another possibility is to replace the BLAS/LAPACK library with a thread-safe
> version. This provides a boost to those R algorithms exploiting these
> libraries.

Replace `thread-safe' by `multi-threaded', as you do need the BLAS to use 
multiple threads itself, not merely be safe to call from threads.  See the 
R-admin manual for how to do this on Linux.
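
As a rough check on whether a multi-threaded BLAS is paying off, time a 
large matrix product before and after the switch (the size here is 
arbitrary):

  ## A large matrix product spends essentially all its time in the BLAS,
  ## so the elapsed time should drop once a threaded ATLAS is in use.
  set.seed(1)
  n <- 2000
  A <- matrix(rnorm(n * n), n, n)
  system.time(A %*% t(A))
  system.time(crossprod(A))   # also BLAS-bound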

> ---
> An alternative is to do all of the parallelization within R using nice tools
> like the snow package combined with Rmpi.  If your task is computationally
> intensive on the R side, but not on the client, then parallelizing R code
> may be the better way to go.  All depends on your application, I think.
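
To give a flavour of that route, a minimal snow sketch using socket 
workers (an MPI cluster via Rmpi is made the same way with type = "MPI"); 
the simulation function is just a toy stand-in:

  library(snow)
  cl <- makeCluster(2, type = "SOCK")   # one worker per core
  sim <- function(i) mean(rnorm(1e6))   # toy task standing in for real work
  res <- clusterApply(cl, 1:8, sim)     # tasks are spread over the workers
  stopCluster(cl)
  unlist(res)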

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



