[R] Reasons to Use R
rdiaz02 at gmail.com
Fri Apr 6 21:18:29 CEST 2007
I'll try not to repeat what others have answered before.
On 4/5/07, Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
> The institute I work for is organizing an internal workshop for High
> Performance Computing (HPC).
> (1)Institutions (not only academia) using R
You can count my institution too. Several groups. (I can provide more
details off-list if you want).
> (2)Hardware requirements, possibly benchmarks
> (3)R & clusters, R & multiple CPU machines, R performance on different hardware.
We do use R on commodity off-the-shelf clusters; our two clusters are
running Debian GNU/Linux; both 32-bit machines ---Xeons--- and 64-bit
machines ---dual-core AMD Opterons. We use parallelization quite a
bit, with MPI (via Rmpi and papply packages mainly). One convenient
feature is that (once the LAM universe is up and running) whether we
are using the 4 cores in a single box, or the max available 120, is
completely transparent. Using R and MPI is, really, a piece of cake.
That said, there are things that I miss; in particular, oftentimes I
wish R were Erlang or Oz because of the straightforward fault-tolerant
distributed computing and the built-in abstractions for distribution
and concurrency. The issue of multithreading has come up several times
in this list and is something that some people miss.
I am not sure how much R is used in the usual HPC realms. It is my
understanding that the "traditional HPC" is still dominated by things
such as HPF, and C with MPI, OpenMP, or UPC or Cilk. The usual answer
to "but R is too slow" is "but you can write Fortran or C code for the
bottlenecks and call it from R". I guess you could use, say, UPC in
that C that is linked to R, but I have no experience. And I think this
code can become a pain to write and maintain (especially if you want to
play around with what you try to parallelize, etc). My feeling (based
on no information or documentation whatsoever) is that how far R can
be stretched or extended into HPC is still an open question.
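
To make the "write C for the bottlenecks" answer a bit more concrete,
here is a minimal sketch, assuming the bottleneck is a column-wise mean
over a numeric matrix. The function name col_means and the file name
col_means.c are made up for illustration; they do not come from any
package. The only parts that are not my invention are the conventions of
R's .C() interface: the function returns void, every argument arrives as
a pointer, and R stores matrices in column-major order.

```c
/* col_means.c: hypothetical example of a bottleneck moved to C.
 * Written for R's .C() interface, which passes every argument as a
 * pointer and ignores the return value (hence void). */
#include <stddef.h>

void col_means(double *x, int *nrow, int *ncol, double *means)
{
    int nr = *nrow;
    int nc = *ncol;

    for (int j = 0; j < nc; j++) {
        double sum = 0.0;

        /* R matrices are column-major: column j starts at offset j * nr. */
        for (int i = 0; i < nr; i++)
            sum += x[(size_t)j * (size_t)nr + (size_t)i];

        /* Guard against an empty matrix; 0.0 is an arbitrary choice here. */
        means[j] = (nr > 0) ? sum / nr : 0.0;
    }
}
```

Something along these lines would be compiled with R CMD SHLIB col_means.c,
loaded with dyn.load(), and called as .C("col_means", as.double(x),
as.integer(nrow(x)), as.integer(ncol(x)), means = double(ncol(x)))$means.
Even for something this small you can see where the maintenance cost
comes from: the types, the memory layout, and the argument order all have
to be kept in sync by hand on both sides.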
> (4)finally, a list of the advantages for using R over commercial
> statistical packages. The money-saving in itself is not a reason good
> enough and some people are scared by the lack of professional support,
> though this mailing list is simply wonderful.
(In addition to all the already mentioned answers)
Complete source code availability. Being able to look at the C source
code for a few things has been invaluable for me.
And, of course, an extremely active, responsive, and vibrant
community that, among other things, has contributed packages and code
for an incredible range of problems.
P.S. I'd be interested in hearing about the responses you get to your
> Kind Regards
> Lorenzo Isella
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)