[R-SIG-Mac] Optimization flags for G5 and G4
Simon Urbanek
simon.urbanek at r-project.org
Sun Feb 20 21:57:39 CET 2005
Due to popular demand, I decided to post a short note on optimization
flags for Macs.
In general, if you want your code to be optimized for (and work only
on!) a specific CPU, say G5, all you have to do is specify
-mtune=G5 -mcpu=G5
By the way, using "G4" is also legal, of course.
However, there is more to optimization than just targeting the CPU:
alignment, inlining, loop handling etc. can each be tuned in several
ways, so there are many additional flags you can use. The most
aggressive flag is "-fast" (see man gcc), but it is not recommended for
R, because it includes -ffast-math, which produces floating-point code
that does not conform to IEEE and simply gives wrong results in some
cases (this matters for the handling of NAs, which then does not work
reliably). But you can try most of the other flags.
Note also that g77 doesn't necessarily support the same arguments as
gcc, so you may need to cut down your list for Fortran.
When using the flags, don't forget to specify both CFLAGS and FFLAGS. In
most cases you probably want to set CXXFLAGS, too, because some
routines (such as svm or gbp for example) use C++ code.
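A minimal sketch of setting all three variables together in a Bourne-style shell. The short flag list is only a placeholder for illustration, and exporting the variables before running configure in the R source tree is an assumption about how you build. Adjust it to your own setup.

```shell
# Placeholder flag list for illustration only; substitute your own.
OPT='-O3 -funroll-loops -mtune=G5 -mcpu=G5'

CFLAGS="$OPT"
FFLAGS="$OPT"          # may need trimming if g77 rejects a flag
CXXFLAGS="$CFLAGS"     # C++ code normally gets the same flags as C
export CFLAGS FFLAGS CXXFLAGS

# Show what a subsequent configure run would pick up from the environment.
echo "CFLAGS=$CFLAGS"
echo "FFLAGS=$FFLAGS"
echo "CXXFLAGS=$CXXFLAGS"

# Then, from the top of the R source tree (assumed workflow):
#   ./configure && make && make check
```

Keeping the shared part in one variable makes it harder for the C and C++ settings to drift apart.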
Finally, here are some examples of what you may want to try on a G5
machine:
CFLAGS='-O3 -fgcse-sm -funroll-loops -fstrict-aliasing
-fsched-interblock -falign-loops=16 -falign-jumps=16
-falign-functions=16 -falign-jumps-max-skip=15
-falign-loops-max-skip=15 -malign-natural -freorder-blocks
-freorder-blocks-and-partition -mpowerpc-gfxopt -mpowerpc-gpopt
-fstrict-aliasing -ftree-vectorize -mtune=G5 -mcpu=G5'
FFLAGS='-O3 -fgcse-sm -funroll-loops -fstrict-aliasing
-fsched-interblock -falign-loops=16 -falign-jumps=16
-falign-functions=16 -malign-natural -freorder-blocks
-freorder-blocks-and-partition -mpowerpc-gfxopt -mpowerpc-gpopt
-fstrict-aliasing -ftree-vectorize -mtune=G5 -mcpu=G5'
Note that -ftree-vectorize is supported only in gcc 4.0, so remove it
for older compilers. Normally you should set CXXFLAGS to the same as
CFLAGS. Always run make check to see whether the compiled code still
behaves correctly. Your mileage may vary.
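If you keep one master flag list, dropping a flag that an older gcc or g77 does not accept can be scripted. The following is only a sketch: strip_flags is a hypothetical helper written for this note, not part of gcc or R, and the flag list is shortened for readability.

```shell
# strip_flags FLAGS DROP...
# Print FLAGS with every word that equals one of the DROP arguments removed.
# (hypothetical helper for illustration)
strip_flags() {
    flags=$1
    shift
    out=
    for f in $flags; do
        keep=yes
        for drop in "$@"; do
            if [ "$f" = "$drop" ]; then keep=no; fi
        done
        if [ "$keep" = yes ]; then out="${out:+$out }$f"; fi
    done
    printf '%s\n' "$out"
}

CFLAGS='-O3 -funroll-loops -ftree-vectorize -mtune=G5 -mcpu=G5'

# Flag list for a compiler older than gcc 4.0.
OLD_CFLAGS=$(strip_flags "$CFLAGS" -ftree-vectorize)
echo "$OLD_CFLAGS"
```

The same helper can take several flags to drop at once, which may be handy when deriving FFLAGS from CFLAGS.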
A word on optimization: the benefit of compiler optimization will vary
a lot with the application. Optimization affects only C/Fortran code in
R and packages, so it is not likely to speed up your application if you
use BLAS heavily, because that is already optimized. Finally, and most
importantly, compiler optimization won't improve the speed if you use
bad R code - your first step should always be to optimize your R code
(use the profiler to figure out where most time is spent - there were
cases where 80% of the time was spent on unnecessary coercions). If
speed is really that important to you, try the R byte-compiler; it may
or may not help in your case.
Cheers,
Simon