[R] Preparing for multi-core CPUs and parallel processing applications

Steve_Friedman at nps.gov
Fri Jul 31 15:22:19 CEST 2009


I am fortunate (or in really big trouble) in that the research group I work
with will soon be receiving several high-end dual quad-core machines. We
will run Ubuntu on these and intend to use the cluster for some
extensive modeling applications. Our programming guru has demonstrated the
ability to link much simpler machines so that they share CPUs, and we
purchased the new ones to take advantage of this option. We have also begun
exploring the R CUDA and J CUDA functionality to push processing to the
graphics processor (GPU), which greatly speeds up the numerical work.

My question(s) to this group:

1)  Which packages are suitable for parallel processing applications in R?
2)  Are these packages ready for prime-time applications, or are they
still developmental at this time?
3)  Are we better off working in Java or C++ for the majority of this
simulation work and linking to R for statistical analysis?
4)  What are the pitfalls, if any, that I need to be aware of?
5)  Can we take advantage of sharing the GPU, via R CUDA, in a
parallel, distributed, shared cluster of dedicated machines?

6)  Our statistical analysis and modeling applications address very large
geographic issues.  We generally work with 30-40 years of daily time step
data in a gridded format. The grid is approximately 250 x 400 cells in
extent, each cell representing approximately 500 meters x 500 meters.  To
this we add a very large suite of ancillary information, both spatial and
non-spatial, to simulate a variety of ecological state conditions.  My
question is: is this too large for R, given its use of memory?

7)  I currently have a laptop running Ubuntu with R version 2.6.2
(2008-02-08). What is the most recent R version for Ubuntu, and what is the
installation procedure?
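
To put a rough number on question 6, here is a back-of-envelope sketch
(written in Python purely for the arithmetic) of the memory one gridded
variable would need if held in full. It assumes 8-byte double-precision
values, 365-day years, and dense uncompressed storage; none of these are
stated above, and the name dense_size_gib is only illustrative.

```python
# Back-of-envelope size of ONE gridded variable held fully in memory.
# Assumptions (not stated in the post): 8-byte doubles, 365 days per year
# (leap days ignored), dense storage with no compression.

ROWS, COLS = 250, 400      # grid extent from question 6
BYTES_PER_VALUE = 8        # assumed: double precision
DAYS_PER_YEAR = 365        # assumed: leap days ignored


def dense_size_gib(years: int) -> float:
    """Return the GiB needed for one variable over `years` of daily grids."""
    n_values = ROWS * COLS * DAYS_PER_YEAR * years
    return n_values * BYTES_PER_VALUE / 2**30


# Report both ends of the 30-40 year range mentioned in question 6.
for years in (30, 40):
    print(f"{years} years: {dense_size_gib(years):.1f} GiB per variable")
```

Under these assumptions a single variable over the full record comes to
roughly 8 to 11 GiB before any ancillary layers are added, so the answer
probably depends on whether the data must be in memory all at once or can
be processed in time slices.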

These are just the first of the questions I'm sure to have.  If I have
directed them to the wrong list, I'm sorry to have taken your time.
If you would be so kind as to direct me to a more appropriate help site,
I'd appreciate your assistance.

Thanks in advance,

Steve Friedman, Ph.D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

Steve_Friedman at nps.gov
Office (305) 224 - 4282
Fax     (305) 224 - 4147
