[Rd] Tesla GPUs [Was: Manipulating single-precision (float) arrays in .Call functions]
simon.urbanek at r-project.org
Tue Jul 19 18:56:15 CEST 2011
On Jul 19, 2011, at 2:26 AM, Prof Brian Ripley wrote:
> On Mon, 18 Jul 2011, Alireza Mahani wrote:
>> Thank you for elaborating on the limitations of R in handling float types. I
>> think I'm pretty much there with you.
>> As for the insufficiency of single-precision math (and hence limitations of
>> GPU), my personal take so far has been that double-precision becomes crucial
>> when some sort of error accumulation occurs. For example, in differential
>> equations where boundary values are integrated to arrive at interior values,
>> etc. On the other hand, in my personal line of work (Hierarchical Bayesian
>> models for quantitative marketing), we have so much inherent uncertainty and
>> noise at so many levels in the problem (and no significant error
>> accumulation sources) that single vs double precision issue is often
>> inconsequential for us. So I think it really depends on the field as well as
>> the nature of the problem.
> The main reason to use only double precision in R was that on modern CPUs double precision calculations are as fast as single-precision ones, and with 64-bit CPUs they are a single access. So the extra precision comes more-or-less for free. You also under-estimate the extent to which stability of commonly used algorithms relies on double precision. (There are stable single-precision versions, but they are no longer commonly used. And as Simon said, in some cases stability is ensured by using extra precision where available.)
> I disagree slightly with Simon on GPUs: I am told by local experts that the double-precision on the latest GPUs (those from the last year or so) is perfectly usable. See the performance claims on http://en.wikipedia.org/wiki/Nvidia_Tesla of about 50% of the SP performance in DP.
That would be good news. Unfortunately those seem to be still targeted at a specialized market and are not really graphics cards in traditional sense. Although this is sort of required for the purpose it removes the benefit of ubiquity. So, yes, I agree with you that it may be an interesting way forward, but I fear it's too much of a niche to be widely supported. I may want to ask our GPU specialists here to see if they have any around so I could re-visit our OpenCL R benchmarks. Last time we abandoned our OpenCL R plans exactly due to the lack of speed in double precision.
More information about the R-devel