[R-SIG-Mac] How to Speed up R on the G5

Tue Feb 8 01:43:36 CET 2005

On Feb 7, 2005, at 4:21 PM, Bill Northcott wrote:

>>
>> My second question is whether there are ways other than using
>> --with-blas="-framework vecLib", to take advantage of what I thought
>> was the power of the G5 (or dual G5s in my case).
>
> Run top and see if you are using both cpus.  If not then Rmpi or 
> something like that may pay big dividends.

R will only use one CPU. AFAIK vecLib is not going to automagically 
become ScaLAPACK or anything like that (I think Intel's MKL can do 
this, but I don't know that anyone has ever gotten it working with R). 
FWIW, all versions of R are single-processor (it doesn't even have a 
GIL), so that won't make much difference when comparing the two systems.
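
Since a single interpreter stays on one CPU, the simplest way to keep 
both busy is to start one process per independent chain. Here is a 
minimal sketch of that idea, written in Python rather than R and 
assuming the chains share nothing. The toy Metropolis sampler, the 
WORKER script and the run_chains name are all made up for illustration; 
this is not how Rmpi works.

```python
import subprocess
import sys

# Worker script: a toy random-walk Metropolis chain targeting a standard
# normal. It stands in for whatever sampler would really be run.
WORKER = r"""
import math, random, sys
rng = random.Random(int(sys.argv[1]))
n = int(sys.argv[2])
x, total = 0.0, 0.0
for _ in range(n):
    prop = x + rng.uniform(-1.0, 1.0)
    # Accept with probability min(1, p(prop) / p(x)) for p = N(0, 1).
    if math.log(1.0 - rng.random()) < 0.5 * (x * x - prop * prop):
        x = prop
    total += x
print(total / n)
"""


def run_chains(seeds, n_iter=20000):
    """Start one OS process per seed, then collect each chain's mean."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER, str(seed), str(n_iter)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for seed in seeds
    ]
    # Every process is already running; communicate() only waits for it.
    return [float(p.communicate()[0]) for p in procs]


if __name__ == "__main__":
    # Two seeds, two processes: one per CPU on a dual-processor machine.
    print(run_chains([1, 2]))
```

While this runs, top should show two busy interpreter processes 
instead of one.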

>
>> Finally, he suggested looking into the AbSoft compilers. But, I
>> figured I'd save my money and see if other folks have had luck with
>> those yet.
>
> As far as I can see the IBM (not Absoft) xlf and xlc compilers are 
> significantly faster, although Apple is working hard on gcc to close 
> the gap.
>
> Other thoughts:
> 1.  I don't think there is any point wasting time on Fortran.  The 
> base R distribution as built on a Mac uses no Fortran code. As far as 
> I can see very few R packages use Fortran.
>
> 2.  Someone else mentioned MCMCs.  These are embarrassingly parallel 
> applications and if they are not using both CPUs they are going to be 
> inefficient.

It may also be that there is a large branching penalty for spending 
time in interpreted code (i.e. within R itself) on the G5 when compared 
to the Opteron; you can see that from the SPECint benchmarks. AFAIK 
the Opteron has a very small number of in-flight instructions (fewer 
even than the Pentium 3/4), something like 90 compared to the G5's 200 
or so. Speaking of the P4, I was at a thing with a couple of guys from 
SLAC and they mentioned that the best way to boost P4 performance is 
to turn off SMT.

>
> Finally some (so far very preliminary) experience:
> I have spent a little time on JAGS, a WinBUGS (MCMC) work-alike which 
> uses the standalone libRmath.  Running the WinBUGS kidney example, 
> this code spends almost all its time in the libm functions power, exp 
> and log which are called from the Weibull distribution functions in R. 
> AFAIK these are not vectorised. At the moment I am not comparing Mac vs 
> PC but WinBUGS vs JAGS.  The author of JAGS thinks the sampling code 
> is inefficient, hence the libm functions are called too often.  I am 
> interested in trying to replace the calls through libRmath into libm 
> with vectorised code, which I suspect will be much more effective on 
> the Mac.
>
> Of course aggressively optimising the compilation of the JAGS code 
> makes absolutely no discernible difference to overall performance in 
> my example.

Interesting. I don't know how much a vectorised version would help with 
something like a Gibbs sampler, where the next draw depends on values 
that change on each iteration through the sampler...
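
To make the dependency concrete, here is a rough sketch in Python 
(not JAGS or libRmath code) of a two-parameter Weibull sampler. It 
uses Metropolis-within-Gibbs updates rather than exact Gibbs draws, 
and assumes a flat prior on the log of each parameter; the function 
names and step size are invented for the example. The parameter 
updates have to run one after another, but within a single update the 
log-density terms over the observations do not depend on each other, 
so that inner loop is the only place a vector log/pow could plausibly 
be applied.

```python
import math
import random


def weibull_loglik(times, shape, scale):
    """Sum of Weibull log densities over all observations.

    Each term needs only the current (shape, scale), so the terms are
    independent of one another: this is the loop a vectorised log/pow
    could in principle process as one batch.
    """
    total = 0.0
    for t in times:
        z = t / scale
        total += math.log(shape / scale) + (shape - 1.0) * math.log(z) - z ** shape
    return total


def sample(times, n_iter, seed=0, step=0.2):
    """Alternate random-walk updates of shape and scale on the log scale."""
    rng = random.Random(seed)
    shape, scale = 1.0, 1.0
    draws = []
    for _ in range(n_iter):
        # Update shape given the latest scale.
        prop = shape * math.exp(rng.uniform(-step, step))
        log_ratio = weibull_loglik(times, prop, scale) - weibull_loglik(times, shape, scale)
        if math.log(1.0 - rng.random()) < log_ratio:
            shape = prop
        # Update scale given the shape just drawn. This step cannot start
        # until the one above has finished: that is the serial dependency.
        prop = scale * math.exp(rng.uniform(-step, step))
        log_ratio = weibull_loglik(times, shape, prop) - weibull_loglik(times, shape, scale)
        if math.log(1.0 - rng.random()) < log_ratio:
            scale = prop
        draws.append((shape, scale))
    return draws


if __name__ == "__main__":
    rng = random.Random(42)
    # random.weibullvariate takes (scale, shape) in that order.
    data = [rng.weibullvariate(2.0, 1.5) for _ in range(100)]
    tail = sample(data, 1000)[500:]
    print(sum(s for s, _ in tail) / len(tail), sum(c for _, c in tail) / len(tail))
```

So batching could shorten each call into libm, but it cannot overlap 
one iteration with the next.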

>
> Bill Northcott
---
Byron Ellis (ellis at stat.harvard.edu)
"Oook" -- The Librarian