[R-sig-hpc] How to configure R to work with GPUs transparently
mseligman at rapidbiologics.com
Fri Mar 19 22:20:39 CET 2010
> I took Brad's posting to mean that he is proposing that many of the more
> computation-intensive R functions be extended, so that the code in lm(),
> say, would first check to see if a GPU and the GPU software are present.
> The code would then take different actions in the two cases (present and
> nonpresent). This is in contrast to a situation in which any R code
> would automatically use GPUs, which as Dirk points out, is not possible.
> By the way, I should also add that while GPUs are great for the
> "embarrassingly parallel" applications, it's hard to make them work well
> for other kinds of parallel apps.
One of the goals of the "gputools" package is to learn just what roles
this type of coprocessor can play in R. We've found significant
speedups (> 10x) in the "lm()" command, for example, but please note a
couple of caveats:
i) This compares a double-precision CPU implementation with a
single-precision GPU implementation.
ii) Data-transfer rates to and from the card are much slower than
to/from RAM. In particular, dimensions greater than about 1000x1000
are required just to break even on performance.
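A sketch of the kind of comparison behind these caveats, assuming the "gputools" package is installed and a CUDA-capable card is present (gpuLm() is gputools' analogue of lm(); the specific sizes here are illustrative):

```r
# Sketch only: requires the gputools package and a CUDA-capable GPU.
# gpuLm() mirrors the lm() interface but computes in single precision.
library(gputools)

n <- 5000; p <- 1000                 # large enough to amortize transfer cost
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)
dat <- data.frame(y = y, X)

cpu <- system.time(fit.cpu <- lm(y ~ ., data = dat))      # double precision, CPU
gpu <- system.time(fit.gpu <- gpuLm(y ~ ., data = dat))   # single precision, GPU

# Agreement should be expected only to single-precision accuracy:
max(abs(coef(fit.cpu) - coef(fit.gpu)))
```

For matrices much below the 1000x1000 range, the system.time() numbers will typically favor the CPU, since host-to-device transfer dominates.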
As for the first problem, it should be noted that much faster
double-precision arithmetic will be supported in future generations of
GPU. For now, though, the more dramatic speedups are limited to
single-precision computation.
As for the second problem, the flip side is that you can now call,
say, "lm()" on much larger matrices and get an answer in something
approaching acceptable "user time". If you need to call "lm()" on
smaller matrices many times in a loop, however, a GPU probably will
not buy you much unless you do the added work of implementing the
enclosing loop in the GPU.
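The loop scenario can be made concrete with plain base R (no GPU needed to run this); each of these small fits would pay a separate host-to-device transfer in a GPU implementation, which is why the loop itself would have to move onto the card to see a benefit:

```r
# Many small regressions in a loop: lm() on the CPU handles these quickly.
# A GPU version would pay a data-transfer cost on every iteration, so the
# per-call overhead would swamp any arithmetic speedup.
set.seed(1)
coefs <- matrix(NA_real_, nrow = 100, ncol = 2)
for (i in 1:100) {
  x <- rnorm(50)
  y <- 2 * x + rnorm(50)            # true slope is 2
  coefs[i, ] <- coef(lm(y ~ x))     # tiny problem: transfer cost would dominate
}
colMeans(coefs)                     # average intercept near 0, slope near 2
```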
More directly to the points you raise, though:
Right now GPU support for R is limited to a small subset of
command-level implementations. These ports are somewhat hard to
implement but, once implemented, they may benefit a large number of
users. They are also relatively easy to maintain as newer, more
powerful GPU hardware becomes available.
It is true that the GPU really shines on embarrassingly-parallel
applications and that communication and synchronization costs subtract
from their potential. The same could be said of a parallel cluster or
a multicore chip, though. If "someone else" has done the work to
provide the tools to make them useful, though, it seems that the GPU,
like these other types of hardware, may have a long-term niche to fill.
Ultimately (pie in the sky?) it would be nice if R itself sniffed out
the user's resident hardware and pulled in libraries built to take
advantage of that particular configuration. In other words, the
details of mapping command to library are kept hidden from the user.
Nicer still if, at run time, R could choose the library into which to
call based on the characteristics of the data - e.g., scalar for a
certain size of matrix, GPU for a very large matrix, some mix of GPU
and multicore for an iterative problem with a large matrix. This is
a really tough thing to get right, though.
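A crude version of that data-driven dispatch could be sketched today as an ordinary wrapper function. Everything here is hypothetical: the smart_lsfit() name, the size threshold, and the choice of gputools::gpuLm.fit() as the GPU backend are illustrative assumptions, not part of any package:

```r
# Hypothetical dispatcher: choose a least-squares backend from the data's
# characteristics.  The threshold and the GPU branch are assumptions for
# illustration; a real solution would live inside R itself.
smart_lsfit <- function(X, y, gpu_threshold = 1000) {
  if (nrow(X) >= gpu_threshold &&
      requireNamespace("gputools", quietly = TRUE)) {
    gputools::gpuLm.fit(X, y)       # GPU path, if hardware/library present
  } else {
    lm.fit(X, y)                    # base-R fallback for small problems
  }
}

# Small problem: falls through to the lm.fit() branch on any machine.
set.seed(2)
X <- cbind(1, matrix(rnorm(200), 100, 2))
y <- drop(X %*% c(1, 2, 3)) + rnorm(100, sd = 0.1)
fit <- smart_lsfit(X, y)
round(fit$coefficients, 1)          # should be close to 1, 2, 3
```

Getting this right in general is hard for exactly the reasons stated above: the crossover points depend on the hardware, the operation, and the data, so a fixed threshold is only a first approximation.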