[Rd] Tesla GPUs [Was: Manipulating single-precision (float) arrays in .Call functions]

Tue Jul 19 18:56:15 CEST 2011

On Jul 19, 2011, at 2:26 AM, Prof Brian Ripley wrote:

> On Mon, 18 Jul 2011, Alireza Mahani wrote:
> 
>> Simon,
>> 
>> Thank you for elaborating on the limitations of R in handling float types. I
>> think I'm pretty much there with you.
>> 
>> As for the insufficiency of single-precision math (and hence limitations of
>> GPU), my personal take so far has been that double-precision becomes crucial
>> when some sort of error accumulation occurs. For example, in differential
>> equations where boundary values are integrated to arrive at interior values,
>> etc. On the other hand, in my personal line of work (Hierarchical Bayesian
>> models for quantitative marketing), we have so much inherent uncertainty and
>> noise at so many levels in the problem (and no significant error
>> accumulation sources) that the single- vs double-precision issue is often
>> inconsequential for us. So I think it really depends on the field as well as
>> the nature of the problem.
> 
> The main reason to use only double precision in R was that on modern CPUs double precision calculations are as fast as single-precision ones, and with 64-bit CPUs they are a single access.  So the extra precision comes more-or-less for free.  You also under-estimate the extent to which stability of commonly used algorithms relies on double precision.  (There are stable single-precision versions, but they are no longer commonly used.  And as Simon said, in some cases stability is ensured by using extra precision where available.)
> 
> I disagree slightly with Simon on GPUs: I am told by local experts that the double-precision on the latest GPUs (those from the last year or so) is perfectly usable.  See the performance claims on http://en.wikipedia.org/wiki/Nvidia_Tesla of about 50% of the SP performance in DP.
> 

That would be good news. Unfortunately, those cards still seem to be targeted at a specialized market and are not really graphics cards in the traditional sense. Although that specialization is more or less required for the purpose, it removes the benefit of ubiquity. So, yes, I agree with you that it may be an interesting way forward, but I fear it's too much of a niche to be widely supported. I may ask our GPU specialists here whether they have any around so I could revisit our OpenCL R benchmarks. Last time, we abandoned our OpenCL R plans precisely because of the lack of speed in double precision.

Thanks,
Simon
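
P.S. To make the error-accumulation point quoted above concrete, here is a minimal sketch (not taken from R or from any of the benchmarks mentioned in this thread). It sums the same value many times, once with every intermediate result rounded to IEEE 754 single precision and once in plain double precision. The helper names (to_float32, sum_single, sum_double) and the choice of one million terms of 0.1 are assumptions made purely for illustration. Python is used only because its struct module can round a double to single precision without extra libraries.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_single(values):
    """Naive running sum with the accumulator rounded to single precision each step."""
    total = 0.0
    for v in values:
        # Both operands are float32 values; rounding the double-precision
        # sum back to float32 matches what a single-precision add would give.
        total = to_float32(total + to_float32(v))
    return total


def sum_double(values):
    """Naive running sum kept in double precision throughout."""
    total = 0.0
    for v in values:
        total += v
    return total


n = 1_000_000          # illustrative size, chosen arbitrarily
values = [0.1] * n
exact = n / 10         # the mathematically exact sum

err_single = abs(sum_single(values) - exact)
err_double = abs(sum_double(values) - exact)

print("absolute error, single-precision accumulator:", err_single)
print("absolute error, double-precision accumulator:", err_double)
```

The single-precision accumulator drifts because, once the running total is large, each added 0.1 is rounded to the spacing of representable floats near that total, and those rounding errors all point the same way. Whether drift of this kind matters in practice depends on the problem, as discussed above: a noisy model may tolerate it, while a long recurrence may not.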