[R] R on dual-core machines
Aleš Žiberna
ales.ziberna at gmail.com
Mon Jan 30 13:53:22 CET 2006
Dear expeRts!
I'm thinking of buying a new computer and am considering dual-core
processors, such as AMD Athlon64 X2. Since I'm not a computer expert, please
forgive me if some of my questions are silly.
First, am I correct that using a dual-core processor is (from R's point of
view) the same as using a computer with two processors?
If that is true, the posts I found on the list imply that such a processor
usually brings significant improvements (in computational time) only where
the code (C or sometimes R) is specially designed for multiple processors
(see comments below).
So based on these and other comments, I conclude that if I'm not prepared
(or able) to make such modifications, I can expect improvements only in
these two areas:
1. If I am running two instances of R.
2. If I'm running several other programs on the computer besides R, the
programs and R would run faster, since they would not "compete" for
processor time (so much).
Thanks in advance for any useful suggestions,
Ales Ziberna
P.S.: Useful posts on the list follow:
It depends on the usage pattern. If you run multiple CPU-bound processes in
parallel without too much coordination (parallel make is a good example,
simulations another), then you get close to double the throughput from a
dual. For a single R process, you can get something like 40% improvement in
large linear algebra problems, using a threaded ATLAS.
For other problems the speedup is basically nil. There is some potential in
threading R or (much easier) some of its vector operations, but that is not
even on the drawing board at this stage.
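One rough way to check the linear algebra case on a given machine is to time
a large matrix product, which R passes to the BLAS it is linked against. This
is only a sketch: the size n is an arbitrary choice for illustration, and
whether a second core is used at all depends on the BLAS installed, which
this code cannot tell you.

```r
# Time a large matrix product; R delegates %*% on numeric matrices to the BLAS.
set.seed(1)
n <- 500                                # arbitrary size, chosen for illustration
a <- matrix(rnorm(n * n), nrow = n)
elapsed <- system.time(b <- a %*% a)    # timings differ by machine and BLAS
print(elapsed)
```

Comparing the elapsed time before and after switching to a threaded BLAS
would indicate how much of the quoted improvement applies locally.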
------------------------------------------------------------------------
If you want to exploit multiple processors, you can write code (e.g., in C)
called from R (e.g., through .Call or .C) that performs parallel/threaded
computations in a thread-safe way (e.g., without calling back into R).
---
Another possibility is to replace the BLAS/LAPACK library with a thread-safe
version. This provides a boost to those R algorithms exploiting these
libraries.
---
An alternative is to do all of the parallelization within R using nice tools
like the snow package combined with Rmpi. If your task is computationally
intensive on the R side, but not on the client, then parallelizing R code
may be the better way to go. All depends on your application, I think.
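As a minimal sketch of that last approach, the following spreads independent
simulations over two local worker processes. It rests on assumptions not made
in the posts above: it uses the parallel package shipped with current R,
which provides the snow-style makeCluster/parLapply/stopCluster calls, and a
local socket cluster instead of Rmpi. The function sim_one is a hypothetical
stand-in for a real CPU-bound task.

```r
library(parallel)  # assumption: supplies the snow-style calls used below

# Hypothetical CPU-bound task: one small simulation per call.
sim_one <- function(i) {
  set.seed(i)
  mean(rnorm(1e5))
}

cl <- makeCluster(2)                    # two local worker processes
results <- parLapply(cl, 1:8, sim_one)  # the eight tasks are split across workers
stopCluster(cl)                         # always release the workers when done

length(results)
```

Each worker is a separate R process, so this is essentially the "two
instances of R" case with the bookkeeping automated.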