Please join us for two R events at SC13 in Denver, this November 17-22:
a tutorial on pbdR and a BoF discussion session to help build a
community of users and experts interested in using R on supercomputers.
(1) Tutorial:
Introducing R: from Your Laptop to HPC and Big Data
Monday, November 18, 2013, 1:30 - 5:00 PM
The R language has been called the lingua franca of data analysis and
statistical computing, and is quickly becoming the de facto standard for
analytics. As such, R is the tool of choice for many working in the
fields of machine learning, statistics, and data mining. This tutorial
will introduce attendees to the basics of the R language with a focus on
its recent high performance extensions enabled by the "Programming with
Big Data in R" (pbdR) project. Although R has a
reputation for lacking scalability, our initial experiments with pbdR
have easily scaled to 12 thousand cores. No background in R is assumed,
but even R veterans will benefit greatly from the session. We will cover
only those basics of R that are needed for the HPC portion of the
tutorial. The tutorial is very much example-oriented, with many
opportunities for the engaged attendee to follow along. Examples will
utilize common data analytics techniques, such as principal components
analysis and cluster analysis.
(2) Birds-of-a-Feather Session:
Super-R: Supercomputing and R for Data-Intensive Analysis
Wednesday, November 20, 2013, 5:30 - 7:00 PM
R has become popular for data analysis in many fields, drawing power
from its high-level expressiveness and numerous domain-specific
packages. While R is clearly a "high productivity" language, it is not
known as a "high performance" language. However, recent efforts have
resulted in methods for effectively scaling R to the power of
supercomputers. This BoF will consist of presentations at the
intersection of supercomputing and R, followed by audience discussion to
share experiences, needs, and questions. The ultimate goal is to help
build a community of users and experts interested in applying R to solve
data-intensive problems on supercomputers.
