[R-sig-hpc] Intel Phi Coprocessor? -> i5/i7 instruction set in R binary?

ivo welch ivo.welch at gmail.com
Mon Jun 10 20:53:21 CEST 2013

thx again, simon.  by stock R, I mean the compiled binaries in
http://cran.r-project.org/bin/linux/ .  dirk e compiled it as

/etc/R/Makeconf:# configure  '--prefix=/usr' '--with-cairo'
'--with-jpeglib' '--with-readline' '--with-tcltk'
'--with-system-bzlib' '--with-system-pcre' '--with-system-zlib'
'--mandir=/usr/share/man' '--infodir=/usr/share/info'
'--datadir=/usr/share/R/share' '--includedir=/usr/share/R/include'
'--with-blas' '--with-lapack' '--enable-R-profiling'
'--enable-R-shlib' '--enable-memory-profiling'
'--without-recommended-packages' '--build' 'x86_64-linux-gnu'
'build_alias=x86_64-linux-gnu' 'R_PRINTCMD=/usr/bin/lpr'
'R_PAPERSIZE=letter' 'R_BROWSER=xdg-open' 'LIBnn=lib' 'CC=gcc
-std=gnu99' 'CFLAGS=-O2 -pipe -g' 'LDFLAGS=' 'CPPFLAGS='
'F77=gfortran' 'FFLAGS=-O2 -pipe -g' 'CXX=g++' 'CXXFLAGS=-O2 -pipe -g'
'FC=gfortran' 'FCFLAGS=-O2 -pipe -g'
/etc/R/Makeconf:CC = gcc -std=gnu99
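
A quick way to see whether a build like the one above asked the compiler for any particular instruction set is to pull the NAME=value settings out of the recorded configure line and look for machine options. The following is a minimal sketch, not part of the original message; the function names configure_vars and arch_flags are made up for illustration, and it assumes that gcc machine options are exactly the flags starting with "-m" (for example -march=corei7-avx).

```python
import shlex


def configure_vars(configure_line):
    """Map NAME=value arguments of a quoted configure line to their values."""
    settings = {}
    for token in shlex.split(configure_line):
        name, sep, value = token.partition("=")
        # Skip options such as --prefix=/usr and bare words such as --build.
        if sep and not name.startswith("-"):
            # Collapse whitespace so that mail line wrapping does not matter.
            settings[name] = " ".join(value.split())
    return settings


def arch_flags(settings, keys=("CFLAGS", "FFLAGS", "CXXFLAGS", "FCFLAGS")):
    """Return the flags per variable that start with -m (assumed machine options)."""
    found = {}
    for key in keys:
        flags = [f for f in settings.get(key, "").split() if f.startswith("-m")]
        if flags:
            found[key] = flags
    return found
```

Applied to the configure line quoted above, arch_flags should come back empty, since every *FLAGS value is just "-O2 -pipe -g". Without a -march option, gcc on x86_64 targets the generic x86-64 baseline, which already includes SSE and SSE2.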

I tried the alternative.  I recompiled R with various corei7 and
corei7-avx flags, but it made no real difference.  by no difference,
I mean in some examples I tried that relied primarily on crossprod,
solve, and some mean calculations, repeated often with random numbers
so that they take measurable time.  (I did not recompile blas.)  dirk's
version performed almost exactly as well.  this could indeed just be
the compiler, but I am stuck with what I have.
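
The comparison described above can be sketched as a small timing harness. This is an illustration under assumptions, not the script actually used: the R expression is only a stand-in for the crossprod/solve workload, the function names are made up, and any Rscript paths passed in are hypothetical. Since crossprod and solve probably spend most of their time inside BLAS/LAPACK, which was not recompiled, similar timings across builds would not be surprising.

```python
import statistics
import subprocess
import time

# Stand-in workload (an assumption): crossprod and solve on random data.
R_EXPR = "x <- matrix(rnorm(1e6), 1000); for (i in 1:20) solve(crossprod(x))"


def time_command(cmd, repeats=5):
    """Run cmd `repeats` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def compare_builds(builds, expr=R_EXPR, repeats=5):
    """Return {label: median seconds} for each Rscript path in `builds`."""
    return {
        label: statistics.median(time_command([rscript, "-e", expr], repeats))
        for label, rscript in builds.items()
    }
```

A call might look like compare_builds({"cran": "/usr/bin/Rscript", "corei7-avx": "/opt/R-avx/bin/Rscript"}), where both paths are placeholders. The timings include R's startup, so the workload should be long enough to dominate it.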

SIMD extensions come in a bewildering set:  MMX, SSE, SSE2, SSE3, SSSE3,
SSE4.1, SSE4.2, AVX, plus AES and PCLMUL.  I would be curious which of
the extra intel instructions the gcc-compiled R binary uses (unix "file"
does not say; it only tells me ELF x86-64), and which processor
capabilities really make a difference.  (I believe AMD is said to have
become terrible for double performance.)  understanding the basics would
help me look at the usual benchmark reports and understand what I should
be looking at.  there are R benchmarks, but no one has written a website
to collect and display the results, or this would have been my first
stop. :-(   SiSoft Sandra 2013 has some float/double native x8
benchmarks; cinebench may, too.  I know---YMMV, but I am not
hoping for great detail.  just basics.  (lots of what I do is very
simple and very plain, just tons of it.  regressions, means, standard
deviations, random numbers, simulations.  nothing fancy.  so
slow-but-many cores processors would be ok for me---but only if pages
remain shared in memory until dirty, simply because my data sets are
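
On the question of which extra instructions a compiled binary actually uses: "file" cannot say, but a disassembly can be tallied by the registers it touches. The following is a rough sketch under assumptions, not a definitive tool. It expects the tab-separated AT&T-style output of "objdump -d", the family labels are made up, and it cannot tell SSE, SSE2, SSE3 and so on apart, because that would need a per-mnemonic table.

```python
import re
from collections import Counter

YMM = re.compile(r"\bymm\d+\b")   # 256-bit AVX registers
XMM = re.compile(r"\bxmm\d+\b")   # 128-bit registers (SSE family or AVX-encoded)
MMX = re.compile(r"\bmm[0-7]\b")  # 64-bit MMX registers


def classify_instruction(text):
    """Return a coarse SIMD family for one disassembled instruction, or None."""
    fields = text.split()
    if not fields:
        return None
    mnemonic = fields[0]
    if YMM.search(text):
        return "avx256"
    if XMM.search(text):
        # Heuristic: AVX-encoded instructions carry a leading "v" (vaddsd, ...).
        return "avx128" if mnemonic.startswith("v") else "sse"
    if MMX.search(text):
        return "mmx"
    return None


def count_simd(disassembly):
    """Count instructions per SIMD family in `objdump -d` style text."""
    counts = Counter()
    for line in disassembly.splitlines():
        parts = line.split("\t")  # address, raw bytes, instruction
        if len(parts) < 3:
            continue  # labels, blank lines, byte continuation lines
        family = classify_instruction(parts[2].strip())
        if family:
            counts[family] += 1
    return counts
```

The input would be the output of something like "objdump -d /usr/lib/R/lib/libR.so", where the library path is an assumption about the installation. A binary built for plain x86-64 should still show many "sse" hits, because scalar double arithmetic on x86-64 uses SSE2 by default.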

apologies for having taken up so much bandwidth.  and thx for the
answers.  and please don't feel obliged to spend even more time on me,
though I am probably not the only one curious about the answers to these,
and this does go to the mailing list that can later be googled.


Ivo Welch (ivo.welch at gmail.com)
