[R-SIG-Mac] How to Speed up R on the G5

Jake Bowers jbowers at csm.Berkeley.EDU
Mon Feb 7 16:10:45 CET 2005

Hi All,

I've been receiving some friendly grief from a friend with a Linux
dual-Opteron system about the performance of his R package on the OS X G5.

He has suggested recompiling R-patched with a variety of different
compilers and compiler flags. He has also suggested just recompiling
his package with different flags and compilers, while leaving
r-patched as I have currently built it using gcc 3.3 20030304 (Apple
Computer, Inc. build 1671) and g77 3.4.2 (from that wonderful site:

I have now successfully recompiled R using a few different
configurations. Each one builds and passes make check (except for
reg-tests-1.R <-- which has failed in all cases and also on my Debian
box, which suggests that there is something going on with
reg-tests-1.R in r-patched that is not OS X dependent).

My first question is how to play with these different versions without
breaking my production version? That is, I don't want to have to
delete my currently working build of R-patched each time I want to run
a speed test.
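
One possible way to keep experimental builds apart from a working
installation, sketched here under assumptions: configure each variant
in its own build directory with its own --prefix, so that "make install"
never writes over the production copy. The locations SRC and ROOT, the
helper name build_cmds, and the variant name g5-veclib are all
hypothetical. The sketch also assumes the source tree itself is clean
(make distclean), since configure refuses to build elsewhere from an
already configured tree. It only prints the commands (a dry run); pipe
the output to sh once they look right.

```shell
#!/bin/sh
# Hypothetical locations: SRC is the r-patched checkout, ROOT holds one
# subdirectory per experimental build. Neither comes from the message.
SRC="${SRC:-$HOME/src/r-patched}"
ROOT="${ROOT:-$HOME/R-builds}"

# Print the commands that build one variant in its own directory and
# install it under its own prefix. The first argument is the variant
# name; the remaining arguments are passed on to configure.
# "make check" is left out here and can be run separately.
build_cmds() {
    name="$1"; shift
    args=""
    for a in "$@"; do
        # Single-quote each argument so values containing spaces survive
        # (assumes the arguments contain no single quotes themselves).
        args="$args '$a'"
    done
    echo "mkdir -p '$ROOT/$name/build'"
    echo "cd '$ROOT/$name/build'"
    echo "'$SRC/configure' --prefix='$ROOT/$name'$args"
    echo "make && make install"
}

# Example: a vecLib build under the made-up name "g5-veclib".
build_cmds g5-veclib --with-blas='-framework vecLib' --with-lapack
```

Each variant would then be started by its full path, for example
$ROOT/g5-veclib/bin/R, while the production R stays wherever it was
installed.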

My second question is whether there are ways other than using
--with-blas="-framework vecLib", to take advantage of what I thought
was the power of the G5 (or dual G5s in my case).

I'm sure this is a complete newbie type of question, and
I apologize in advance for my ignorance!

For those of you who are interested, here are some ways that I've
been trying to optimize R for the G5. I can't report speed tests yet
because of my inexperience with compiling things (as made clear by my
first question!).

FYI, I'm building versions in the most stripped down way that I can
envision, since I mainly just want speed. I'm also doing make
distclean in between builds, and hand editing tests/Makefile to delete
the reference to reg-tests-1.R after it fails. And I am using
r-patched updated via svn update yesterday.

Here is what I'm playing with:

1) One set of builds with standard compilers and flags
(--with-blas="-framework vecLib" --with-lapack)

2) One build like (1) but using the libgoto.dylib version of BLAS and
the vecLib stuff for lapack (It doesn't work with just
--with-blas="-L/usr/local/lib -lgoto"
--with-lapack). (http://www.cs.utexas.edu/users/kgoto/signup_first.html#For_OS_X)

./configure --with-blas="-L/usr/local/lib -lgoto"
--with-lapack="-framework vecLib" --without-aqua --with-x
--disable-R-shlib --disable-R-profiling --without-recommended-packages

3) Another set of builds with some compiler flags:
 C compiler:                /usr/bin/gcc  -g -O3 -mcpu=970 -mtune=970 -mpowerpc64 -mpowerpc-gpopt -force_cpusubtype_ALL
 C++ compiler:              g++  -g -O3 -mcpu=970 -mtune=970 -mpowerpc64 -mpowerpc-gpopt -force_cpusubtype_ALL
 Fortran compiler:          g77  -O3 -mcpu=970 -mtune=970 -mpowerpc64 -mpowerpc-gpopt -force_cpusubtype_ALL

4) Another like (3), but with the libgoto BLAS.
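
The four configurations above can be written down as data so that each
one is reproducible. This is only a sketch: the variant names and the
helper variant_args are made up, the flags are copied from the list
above, and it assumes that the stripped-down options shown for build (2)
apply to every build and that the compiler flags are passed to configure
as CFLAGS, CXXFLAGS and FFLAGS settings. It prints the configure lines
rather than running them.

```shell
#!/bin/sh
# Flags copied from the message; everything else here is illustrative.
G5FLAGS="-O3 -mcpu=970 -mtune=970 -mpowerpc64 -mpowerpc-gpopt -force_cpusubtype_ALL"

# Assumption: the same stripped-down options are used for every build.
COMMON="--without-aqua --with-x --disable-R-shlib --disable-R-profiling --without-recommended-packages"

# BLAS/LAPACK choices for builds (1)/(3) and (2)/(4).
VECLIB='--with-blas="-framework vecLib" --with-lapack'
GOTO='--with-blas="-L/usr/local/lib -lgoto" --with-lapack="-framework vecLib"'

# Compiler flags for builds (3) and (4); C and C++ also get -g.
TUNED="CFLAGS=\"-g $G5FLAGS\" CXXFLAGS=\"-g $G5FLAGS\" FFLAGS=\"$G5FLAGS\""

# Print the configure arguments for one (hypothetically named) variant.
variant_args() {
    case "$1" in
        plain-veclib) printf '%s\n' "$VECLIB $COMMON" ;;
        plain-goto)   printf '%s\n' "$GOTO $COMMON" ;;
        g5-veclib)    printf '%s\n' "$TUNED $VECLIB $COMMON" ;;
        g5-goto)      printf '%s\n' "$TUNED $GOTO $COMMON" ;;
        *) echo "unknown variant: $1" >&2; return 1 ;;
    esac
}

# Dry run: show the configure line for each of the four builds.
for v in plain-veclib plain-goto g5-veclib g5-goto; do
    printf './configure %s\n' "$(variant_args "$v")"
done
```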

This leaves me with 4 builds to test. I figure I have to say "R CMD
INSTALL thepackage.tar.gz" for each build to test my friend's package. At
least that is what I think... I don't really know if there is a more
direct way.
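
A sketch of what that per-build loop might look like, assuming each
build was installed under its own prefix. ROOT, the variant names, the
benchmark script speedtest.R, and the helper test_cmds are hypothetical.
It also assumes R_LIBS is not set, so that "R CMD INSTALL" puts the
package into the library of whichever R runs it. The commands are only
printed; pipe them to sh to run them.

```shell
#!/bin/sh
# Hypothetical names: ROOT holds one installed R per variant, PKG is the
# package tarball, BENCH is an R script that exercises the package.
ROOT="${ROOT:-$HOME/R-builds}"
PKG="${PKG:-thepackage.tar.gz}"
BENCH="${BENCH:-speedtest.R}"

# Print the commands that install the package into one build and then
# run the benchmark with that same build. R CMD BATCH appends the
# proc.time() timing to the end of its output file.
test_cmds() {
    r="$ROOT/$1/bin/R"
    out="${BENCH%.R}-$1.Rout"
    echo "'$r' CMD INSTALL '$PKG'"
    echo "'$r' CMD BATCH --vanilla '$BENCH' '$out'"
}

# Dry run over two made-up variant names.
for v in plain-veclib g5-veclib; do
    test_cmds "$v"
done
```

Comparing the timing lines at the end of each .Rout file would then
give a rough speed comparison between the builds.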

*Other attempts at optimization which failed:

My friend also suggested using gcc-4.0 with CFLAGS and FFLAGS
including "-ftree-vectorize -maltivec", but this wouldn't completely

Another option was to use other compiler flags on the Apple provided gcc, like this:

 C compiler:  /usr/bin/gcc -g -O3 -funroll-loops -fstrict-aliasing
 -fsched-interblock -falign-loops=16 -falign-jumps=16 -falign-functions=16
 -falign-jumps-max-skip=15 -falign-loops-max-skip=15 -malign-natural
 -ffast-math -mpowerpc-gpopt -force_cpusubtype_ALL -fstrict-aliasing
 -mtune=G5 -mcpu=G5 -mpowerpc64

 C++ compiler:  g++ -g -O3 -mcpu=970 -mtune=970 -mpowerpc64
 -mpowerpc-gpopt -force_cpusubtype_ALL -funroll-loops -fstrict-aliasing
 -fsched-interblock -falign-loops=16 -falign-jumps=16 -falign-functions=16
 -falign-jumps-max-skip=15 -falign-loops-max-skip=15 -malign-natural

  Fortran compiler:  g77 -O3 -funroll-loops -fstrict-aliasing
  -fsched-interblock -falign-loops=16 -falign-jumps=16
  -falign-functions=16 -falign-jumps-max-skip=15 -falign-loops-max-skip=15
  -malign-natural -ffast-math -mpowerpc-gpopt -force_cpusubtype_ALL
  -fstrict-aliasing -mtune=G5 -mcpu=G5 -mpowerpc64

but, although this compiled OK, it failed make check on the first test
(base-Ex.R) with:
> tx0 <- c(9, 4, 6, 5, 3, 10, 5, 3, 5)
> x <- rep(0:8, tx0)
> stopifnot(table(x) == tx0)
Warning in table(x) == tx0 : longer object length
	is not a multiple of shorter object length
Error in stopifnot(table(x) == tx0) : dim<- : dims [product 8] do not match the length of object [9]
Execution halted

Finally, he suggested looking into the AbSoft compilers. But, I
figured I'd save my money and see if other folks have had luck with
those yet.

Thanks very much for any thoughts or help any of y'all might have!


Jake Bowers
Assistant Professor
Dept of Political Science
University of Michigan
jwbowers at umich.edu
