[R] Performance difference between 32-bit build and 64-bit build on Solaris 8
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sun Jun 12 00:07:12 CEST 2005
On Sat, 11 Jun 2005, Peter Dalgaard wrote:
> Scott Gilpin <sgilpin at gmail.com> writes:
>
>> Andy, Prof. Ripley - thanks for your replies.
>>
>>> CFLAGS was not specified, so should default to "-g -O2". It is definitely
>>> worth checking, although almost all the time in these tests will be spent
>>> in Fortran code.
>> Yes - I verified that's the default.
>>
>>>>> neither build uses a BLAS.
>>>
>>> Well, they do, the one supplied with R (which is in Fortran).
>>
>> I guess I should have said that neither build is using an optimized
>> BLAS. (which I am planning to install - I just haven't had the chance
>> yet.) I also get a message in config.log "ld: fatal: library -lblas:
>> not found". I need to investigate this more.
>
> Actually, you did say so...
>
> Don't worry about error messages like that in config.log; they only
> mean that there is no system BLAS and the only way to find out is by
> trial and failure. The configure script is full of that sort of code.
>
>>
>>> and for gcc 3.4.3 and the internal BLAS (which I had to build for these
>>> tests) I get
>>>
>>> 32-bit
>>> [1] 9.96 0.09 10.12 0.00 0.00
>>> 64-bit
>>> [1] 9.93 0.04 10.04 0.00 0.00
>>>
>>> so I am not seeing anything like the same performance difference between
>>> 32- and 64-bit builds (but it could well depend on the particular Sparc
>>> chip).
>>
>> These timings are much, much less than what I reported (~700s and
>> 2200s for 32 bit and 64 bit). I read the admin manual and didn't see
>> anything specifically that needs to be set to use the internal BLAS.
>> I guess I'll go back and do some more investigation.

For the record, I timed 1000x1000 not 3000x3000 (and said so). I was not
proposing to spend several hours running timings at ca 2200s each, not
least as I used a public machine with a ban on running long jobs (we have
other much faster machines for that purpose).

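
To put the two sets of figures on the same footing, here is a rough sketch (in Python, purely for the arithmetic) that scales the 1000x1000 elapsed times quoted above up to 3000x3000. It assumes the test is dominated by dense linear algebra whose cost grows as n^3; the thread does not say which operation was timed, so that is an assumption, and the function name extrapolate_cubic is made up for this illustration. Cache effects and memory limits are ignored.

```python
def extrapolate_cubic(seconds: float, n_from: int, n_to: int) -> float:
    """Scale a timing from an n_from x n_from problem to n_to x n_to.

    Assumes cost proportional to n**3 (typical of dense matrix products
    and factorizations); this is an assumption, not a measured result.
    """
    return seconds * (n_to / n_from) ** 3


# Elapsed times (third value of each timing line quoted above), 1000x1000 runs.
elapsed_1000 = {"32-bit": 10.12, "64-bit": 10.04}

for build, seconds in elapsed_1000.items():
    estimate = extrapolate_cubic(seconds, 1000, 3000)
    print(f"{build}: about {estimate:.0f}s at 3000x3000")
```

Under that assumption both builds come out at roughly 270s for 3000x3000, which is still well below the ~700s and ~2200s reported, so part of the gap may simply be down to the machines being different.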
> Yes. If you have hardcore linear algebra needs, those fast-BLAS
> speedups can be impressive (Brian might also have a faster machine
> than you, mind you). For code that is mainly interpreter-bound, it is
> much less so.
>
> While your setup is in place, you might want to play around with the
> higher optimization levels. GCC on AMD64 sees a quite substantial
> speedup from -O2 to -O3.
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595