[R-SIG-Mac] How to Speed up R on the G5
Michael Redmond
redmond at cs.wisc.edu
Mon Feb 7 16:40:53 CET 2005
Jake,
I am also very interested in this. We are running R through an iNquiry
portal, and have found that (at least for an early test) the performance
is not as good as on a 2.8 GHz P4 system (11 hours on the P4 vs 13 hours on
the Xserve). The P4 has less memory (1G on P4 vs 2G on Xserve nodes). I
am ready to benchmark any improvements.
The application we have is R with Bioconductor doing MCMC, though I
don't have many more specifics. All I know is that the test is
"real-world".
Our installation is via compile from source of R-2.0.1 using fink. No
modifications were made to the standard Make/Install config.
Thanks
Mike
---
Jake Bowers wrote:
> Hi All,
>
> I've been receiving some friendly grief from a friend with a Linux
> dual-Opteron system about the performance of his R package on the OS X G5
> system.
>
> He has suggested recompiling R-patched with a variety of different
> compilers and compiler flags. And has also suggested just recompiling
> his package with different flags and compilers (while leaving
> r-patched as I have currently built it using gcc 3.3 20030304 (Apple
> Computer, Inc. build 1671), and g77 3.4.2 (from that wonderful site:
> hpc.sf.net)).
>
> I have now successfully recompiled R using a few different
> configurations. Each one builds and passes make check (except for
> reg-tests-1.R <-- which has failed in all cases and also on my debian
> box, which suggests that there is something going on with
> reg-tests-1.R in r-patched that is not OS X dependent)
>
> My first question is how to play with these different versions without
> breaking my production version? That is, I don't want to have to
> delete my currently working build of R-patched each time I want to run
> a speed test.
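One possible approach to the first question, offered as a sketch rather than a tested recipe: give each experimental build its own build directory and its own --prefix, so that "make install" never touches the production copy. The directory layout (R-builds under $HOME, source in ./R-patched), the build name "veclib", and the helper names are assumptions made up for illustration. The script only prints the commands, so they can be checked before anything runs.

```shell
#!/bin/sh
# Sketch: one build directory and one install prefix per experimental build.
# Assumptions (not from the original post): the source tree is ./R-patched,
# builds live under $HOME/R-builds, and the build names are invented.

SRC_DIR="${SRC_DIR:-$PWD/R-patched}"
BUILD_ROOT="${BUILD_ROOT:-$HOME/R-builds}"

# Print the install prefix used for a named build.
build_prefix() {
    echo "$BUILD_ROOT/$1/install"
}

# Print (do not run) the commands for one build.
# $1 = build name, remaining arguments = extra configure options.
build_commands() {
    name=$1
    shift
    echo "mkdir -p '$BUILD_ROOT/$name/obj'"
    echo "cd '$BUILD_ROOT/$name/obj'"
    echo "'$SRC_DIR/configure' --prefix='$(build_prefix "$name")' $*"
    echo "make && make install"
}

# Example: the vecLib configuration from item (1) of the post.
build_commands veclib '--with-blas="-framework vecLib"' --with-lapack
```

Piping the output to sh would run the build. Configuring outside the source tree like this assumes the tree itself is clean (for example after make distclean).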
>
> My second question is whether there are ways other than using
> --with-blas="-framework vecLib", to take advantage of what I thought
> was the power of the G5 (or dual G5s in my case).
>
> I'm sure this is a complete newbie type of question, and
> I apologize in advance for my ignorance!
>
> For those of you who are interested, here are some ways that I've
> been trying to optimize R for the G5. I can't report speed tests yet
> because of my inexperience with compiling things (as made clear by my
> first question!).
>
> FYI, I'm building versions in the most stripped down way that I can
> envision, since I mainly just want speed. I'm also doing make
> distclean in between builds, and hand editing tests/Makefile to delete
> the reference to reg-tests-1.R after it fails. And I am using
> r-patched updated via svn update yesterday.
>
> Here is what I'm playing with:
>
> 1) One set of builds with standard compilers and flags
> (--with-blas="-framework vecLib" --with-lapack)
>
> 2) One build like (1) but using the libgoto.dylib version of BLAS and
> the vecLib stuff for lapack (It doesn't work with just
> --with-blas="-L/usr/local/lib -lgoto"
> --with-lapack). (http://www.cs.utexas.edu/users/kgoto/signup_first.html#For_OS_X)
>
> ./configure --with-blas="-L/usr/local/lib -lgoto"
> --with-lapack="-framework vecLib" --without-aqua --with-x
> --disable-R-shlib --disable-R-profiling --without-recommended-packages
>
> 3) Another set of builds with some compiler flags:
> C compiler: /usr/bin/gcc -g -O3 -mcpu=970 -mtune=970 -mpowerpc64 -mpowerpc-gpopt -force_cpusubtype_ALL
> C++ compiler: g++ -g -O3 -mcpu=970 -mtune=970 -mpowerpc64 -mpowerpc-gpopt -force_cpusubtype_ALL
> Fortran compiler: g77 -O3 -mcpu=970 -mtune=970 -mpowerpc64 -mpowerpc-gpopt -force_cpusubtype_ALL
>
> 4) Another like (3), but with the libgoto BLAS.
>
> This leaves me with 4 builds to test. I figure I have to say "R CMD
> INSTALL thepackage.tar.gz" for each build to test my friend's package. At
> least that is what I think... I don't really know if there is a more
> direct way.
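A rough sketch of that per-build install step, assuming each build has been installed under its own prefix: calling a build's own R front end for "R CMD INSTALL" compiles the package with the compilers and flags recorded for that build. The layout $BUILD_ROOT/<name>/install and the four build names are hypothetical; thepackage.tar.gz is the placeholder name used in the post.

```shell
#!/bin/sh
# Sketch: install the same package into every experimental build by calling
# that build's own R binary.
# Assumptions: builds are installed under $BUILD_ROOT/<name>/install
# (an invented layout) and the build names below are made up.

BUILD_ROOT="${BUILD_ROOT:-$HOME/R-builds}"
PKG="${PKG:-thepackage.tar.gz}"

# Print the path of the R front end for a named build.
r_binary() {
    echo "$BUILD_ROOT/$1/install/bin/R"
}

# Install the package with every build whose R binary exists;
# report the builds that are missing instead of failing.
install_everywhere() {
    for name in "$@"; do
        r=$(r_binary "$name")
        if [ -x "$r" ]; then
            "$r" CMD INSTALL "$PKG"
        else
            echo "skipping $name: $r not found"
        fi
    done
}

install_everywhere veclib goto veclib-O3 goto-O3
```

Each build then carries the package in its own library (unless R_LIBS points somewhere shared), so the four installs should not interfere with each other or with the production R.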
>
> *Other attempts at optimization which failed:
>
> My friend also suggested using gcc-4.0 with CFLAGS and FFLAGS
> including "-ftree-vectorize -maltivec", but this wouldn't completely
> build.
>
> Another option was to use other compiler flags on the Apple provided gcc, like this:
>
> C compiler: /usr/bin/gcc -g -O3 -funroll-loops -fstrict-aliasing
> -fsched-interblock -falign-loops=16 -falign-jumps=16 -falign-functions=16
> -falign-jumps-max-skip=15 -falign-loops-max-skip=15 -malign-natural
> -ffast-math -mpowerpc-gpopt -force_cpusubtype_ALL -fstrict-aliasing
> -mtune=G5 -mcpu=G5 -mpowerpc64
>
> C++ compiler: g++ -g -O3 -mcpu=970 -mtune=970 -mpowerpc64
> -mpowerpc-gpopt -force_cpusubtype_ALL -funroll-loops -fstrict-aliasing
> -fsched-interblock -falign-loops=16 -falign-jumps=16 -falign-functions=16
> -falign-jumps-max-skip=15 -falign-loops-max-skip=15 -malign-natural
> -ffast-math
>
> Fortran compiler: g77 -O3 -funroll-loops -fstrict-aliasing
> -fsched-interblock -falign-loops=16 -falign-jumps=16
> -falign-functions=16 -falign-jumps-max-skip=15 -falign-loops-max-skip=15
> -malign-natural -ffast-math -mpowerpc-gpopt -force_cpusubtype_ALL
> -fstrict-aliasing -mtune=G5 -mcpu=G5 -mpowerpc64
>
> but, although this compiled ok, it failed the make check on the first test (base-Ex.R with:
>
> > tx0 <- c(9, 4, 6, 5, 3, 10, 5, 3, 5)
> > x <- rep(0:8, tx0)
> > stopifnot(table(x) == tx0)
>
> Warning in table(x) == tx0 : longer object length
> is not a multiple of shorter object length
> Error in stopifnot(table(x) == tx0) : dim<- : dims [product 8] do not match the length of object [9]
> Execution halted)
>
> Finally, he suggested looking into the AbSoft compilers. But, I
> figured I'd save my money and see if other folks have had luck with
> those yet.
>
> Thanks very much for any thoughts or help any of y'all might have!
>
>
> Jake
>
> Jake Bowers
> Assistant Professor
> Dept of Political Science
> University of Michigan
> jwbowers at umich.edu
> http://www.umich.edu/~jwbowers/
>
> _______________________________________________
> R-SIG-Mac mailing list
> R-SIG-Mac at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-mac