[R-sig-Debian] Compiling R-2.11.0 with ATLAS-tuned BLAS and LAPACK

Paul Johnson pauljohn32 at gmail.com
Thu Jun 3 00:27:36 CEST 2010


On Wed, Jun 2, 2010 at 2:00 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
>
> On 2 June 2010 at 13:28, Paul Johnson wrote:
> | On Tue, Jun 1, 2010 at 10:56 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
> | > Who says you need libRblas.so?  We no longer do.
> | >
> | > | different BLAS.
> | > |
> |
> |
> | The R install & admin manual says so, actually.  I think that you know
> | what's going on much more accurately than it does, and perhaps you
> | don't see that doc the way we do.
>
> I actually don't read it much as I have no problems building R (on Linux), so
> maybe I shouldn't have sent you that way as it may have clouded and muddled
> your understanding instead of helping.

> Initially, in your previous email, you said
>
>   This question reminded me I never understood BLAS linkage with R when
>   I asked about it 2 months ago and I forgot to follow up.
>
> and there is really nothing magic here.  There is an interface (called BLAS)
> and a number of interchangeable libraries that can all provide libblas.so.
> They are "simply" arranged in such a manner (by the atlas + lapack packages)
> that the best one is preferred. That's all.  The devil is the detail _and I
> am merely using these facilities from R and other packages_ (which included
> Octave when I still maintained Octave).
>
> All I do is make sure libblas.so is there when R runs configure, so that R
> finds it and builds against it. Presto -- now you get your "pluggability".
>
> Depending on which package (libatlas*, refblas, ...) you have installed,
> running
>
>        ldd /usr/lib/R/bin/exec/R
>
> will point to different libraries standing in libblas.so.
>
> Lastly, I would recommend that you stop worrying about libRblas.so.  Note the
> R in that name. It is a fallback provided by R when the system has nothing
> better.  In a darker age we (as in Debian) had to use it too (as gfortran and
> lapack had issues) but that has long passed.  We now have something better,
> and it works. Enjoy it.
>

Thanks. I've got to make this work for RedHat/Centos, Fedora, and some
Solaris systems, so it helps to get to the core of it.

I'm "pretty sure" this is approximately right. If you think I'm right,
thanks for your tips along the way. If I'm wrong, well, it's somebody
else's fault :)

Question: How does the R executable know which BLAS shared library to use?

Answer: It uses whatever shared-object name (soname) it was told to use
at build time, and the dynamic linking mechanism of the OS helps it find
the file.

If you build R with a configure option that tells it to use an
external BLAS like ATLAS and you do not specify --enable-BLAS-shlib,
then no libRblas.so is created, and the R executable will look for the
BLAS library that it found at compile time.

Here's what I see with the deb packages from CRAN:

$ ldd /usr/lib/R/bin/exec/R
        linux-vdso.so.1 =>  (0x00007fff9c9ff000)
        libR.so => /usr/lib/libR.so (0x00007fcfe9550000)
        libc.so.6 => /lib/libc.so.6 (0x00007fcfe91ce000)
        libblas.so.3gf => /usr/lib/libblas.so.3gf (0x00007fcfe8f32000)

In Ubuntu 10.04 (lucid), the installation of R from CRAN (r-base,
r-base-core, etc) causes the installation of the addon packages
"libblas-dev" and "libblas3gf" and you see above that R is linked
against it.
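A minimal sketch of how to repeat that check on another machine. The
helper name blas_deps and its pattern list are my own assumptions, not
anything R or Debian ships; it only filters ldd output down to the
BLAS-related lines, and it is guarded so it does nothing if R is not
installed at the path shown above.

```shell
# Filter `ldd` output (read from stdin) down to BLAS/LAPACK-related entries.
# blas_deps and the pattern list are illustrative assumptions.
blas_deps() {
    grep -E -i 'blas|lapack|atlas|goto' || true
}

# Only inspect R if the binary actually exists at this path.
R_BIN=${R_BIN:-/usr/lib/R/bin/exec/R}
if [ -x "$R_BIN" ]; then
    ldd "$R_BIN" | blas_deps
fi
```

On the system above this should print just the libblas.so.3gf line.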

In theory, according to README.Atlas.gz, it should be possible to have
several BLAS library collections installed at once.  From Atlas, we
could have "base", "sse", and "sse2".  The docs say "sse2" is best.

In my Ubuntu system,  I have the universe repository enabled, but I
don't see the Atlas sse or sse2 versions (even though the launchpad
listing says those packages exist.)  For Ubuntu 10.04, for blas we
have the one r-base-core pulls in,

libblas-dev   version 1.2-2build1
libblas3gf     version 1.2-2build1

and uninstalled:

libatlas3gf-base version 3.6.0-24
libatlas-base-dev
libatlas-headers

The libblas-dev installs

/usr/lib64/libblas.so.3gf.

And libatlas3gf-base installs in a subdir:

/usr/lib64/atlas/libblas.so.3gf.

R goes MUCH FASTER if libatlas3gf-base is installed.

Before installing libatlas3gf-base, I get this:

> mm <- matrix(rnorm(10^7), ncol = 10^3)
> system.time(crossprod(mm))
  user  system elapsed
  9.390   0.010   9.424

After installing libatlas3gf-base and restarting the computer, it
became MUCH faster:

> mm <- matrix(rnorm(10^7), ncol = 10^3)
>
> system.time(crossprod(mm))
   user  system elapsed
  2.250   0.000   2.254

I've installed and uninstalled that package several times, and R gets
slower and faster accordingly. The minimal conclusion I draw from this
is that Ubuntu users should install libatlas3gf-base.
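To repeat the timing after each package change without retyping it, one
could wrap the same R commands in a small shell function. This is only a
sketch: bench_crossprod is a name I made up, and it assumes Rscript is on
the PATH.

```shell
# bench_crossprod is a hypothetical helper name; it assumes Rscript is on PATH.
bench_crossprod() {
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found" >&2
        return 1
    fi
    # Same matrix size and call as in the R sessions above.
    Rscript -e 'mm <- matrix(rnorm(10^7), ncol = 10^3); print(system.time(crossprod(mm)))'
}

# Usage: run once per BLAS configuration and compare the elapsed column.
#   bench_crossprod
```

Each call starts a fresh R process, so it picks up whatever BLAS the
dynamic linker resolves at that moment.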

Clearly, there is some dynamic linking "magic" going on so that the
system knows which libblas.so.3gf to use when R asks for it.  I have
not seen this before, where 2 identically named .so files exist. But
check the output of

$ /sbin/ldconfig -p


  libblas.so.3gf (libc6,x86-64) => /usr/lib/atlas/libblas.so.3gf
  libblas.so.3gf (libc6,x86-64) => /usr/lib/libblas.so.3gf

Hm. 2 libraries with the exact same name. The one in the atlas
directory is found first, so R uses it.  If I remove libatlas3gf-base,
then, of course, the only one that is found is from libblas3gf.
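A small sketch for picking out the entry that is listed first for a
given soname. The helper name first_provider is hypothetical, and it
rests on the observation above that the first listed entry is the one R
ends up using; I have not checked that against the linker documentation.

```shell
# Print the path of the first entry in `ldconfig -p` output (read from stdin)
# whose soname equals $1. first_provider is a hypothetical helper name.
first_provider() {
    awk -v so="$1" '$1 == so && !seen { print $NF; seen = 1 }'
}

# Guarded so the sketch does nothing on systems without /sbin/ldconfig.
if [ -x /sbin/ldconfig ]; then
    /sbin/ldconfig -p | first_provider libblas.so.3gf
fi
```

With both packages installed as above, this should print the path under
/usr/lib/atlas; with libatlas3gf-base removed, the plain /usr/lib one.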

Question: How can one experiment with other versions of BLAS?

Answer: Either replace the file /usr/lib/libblas.so.3gf with some
other shared object file, or rebuild R using --enable-BLAS-shlib and
replace the resulting libRblas.so.

Explanation:

The README.Atlas is pretty outdated.
It outlines a testing procedure, a script that uses the package
manager to remove all blas packages, run R, then install an atlas
package, run R, then install a different atlas, and so forth. That
does not work on current Ubuntu. The package system will not allow you
to remove libblas3gf to test this out.

In the README.Atlas file, it shows a speedup from ordinary R blas to
atlas3gf-base that is substantial, and then about the same percentage
improvement after upgrading to atlas-sse2.  So if I can figure how to
set the Ubuntu repositories to get their version of sse2, I'll test.

In the meanwhile, I downloaded the GotoBLAS2 code, version 1.13, from
the U Texas site (http://www.tacc.utexas.edu/tacc-projects/).  I
can't give you that file because you are not supposed to redistribute
it, but you can sign up and get it for free.  It's easy to build.  I just
ran their script "quickbuild.64bit" and 5 minutes later out popped a
shared library.  After removing the libatlas3gf-base package (just to be
sure), I did this:

$ sudo cp libgoto2_penrynp-r1.13.so /usr/lib64
$ cd /usr/lib64
$ sudo mv libblas.so.3gf.0  libblas.so.3gf.0-orig
$ sudo ln -sf libgoto2_penrynp-r1.13.so libblas.so.3gf.0

Now look at my time:

> mm <- matrix(rnorm(10^7), ncol = 10^3)
>
> system.time(crossprod(mm))
   user  system elapsed
  1.140   0.040   0.592

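WOW!  About 2x as fast as libatlas3gf-base in user time (almost 4x in
elapsed time), and roughly 16x faster in elapsed time than Ubuntu's
default libblas3gf.  Elapsed time below user time suggests GotoBLAS2 is
running on more than one core.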
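For the record, the ratios between the elapsed times in the three
sessions above can be worked out with a one-line helper. The name
speedup is just something I picked for this sketch.

```shell
# speedup is a hypothetical helper: ratio of a slower time to a faster one.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

# Elapsed times (seconds) from the three R sessions above.
speedup 9.424 2.254   # default libblas3gf vs. libatlas3gf-base
speedup 2.254 0.592   # libatlas3gf-base vs. GotoBLAS2
speedup 9.424 0.592   # default libblas3gf vs. GotoBLAS2
```

That gives roughly 4.2, 3.8 and 15.9 respectively.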

The only downside here is that I've blocked all the other users on the
system from using the libblas3gf that they usually expect.  I should
probably just bother the R users.

That's where libRblas.so comes into the picture. In order to leave
libblas.so.3gf.0 unchanged, I found it is valuable to build R with
the --enable-BLAS-shlib option.  That creates libRblas.so, which R
finds like so:


$ ldd /usr/lib/R/bin/exec/R
        linux-vdso.so.1 =>  (0x00007fffe49c0000)
        libR.so => /usr/lib/libR.so (0x00007f4137331000)
        libRblas.so => /usr/lib/libRblas.so (0x00007f413712d000)

Replace libRblas.so with a symlink to the ATLAS or GotoBLAS2 shared
object file and all is done.

If I see any significant differences on our Fedora or RedHat/Centos
systems, I'll let you know.

pj
-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas


