[Rd] Cluster: Various GCC, how important is consistency?

Paul Johnson pauljohn32 at gmail.com
Tue Oct 18 01:44:13 CEST 2016


On a cluster that is based on RedHat 6.2, we are updating to R-3.3.1.
I have, from time to time, run into problems with various R packages
and some older versions of GCC. I wish we had newer Linux in the
cluster, but with 1000s of nodes running 1000s of jobs, well, they
don't want a restart.

The administrator suggested I try to build with the GCC that is provided
with the nodes, which is gcc-4.4.7.  To my surprise, R-3.3.1 compiled
with that.  After that, I got quite far, many 100s of packages
compiled, but then I hit a snag: RcppArmadillo explicitly refuses
to build with anything older than gcc-4.6.  The OpenMx and emplik
packages also refuse to compile with the old gcc.

The cluster uses a module system, so it is easy enough to swap in various
gcc versions to see what compiles.

I did succeed in compiling RcppArmadillo with gcc 4.9.2. Rcpp itself is
not picky; it compiled with gcc-4.4.7.
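
A rough sketch of how one might check whether the gcc currently on the
PATH clears the gcc-4.6 floor mentioned above before starting a package
build. The helper name version_ge is made up for illustration, and the
comparison assumes GNU sort with the -V (version sort) option.

```shell
# Hypothetical helper: succeeds if version $1 is >= version $2.
# Relies on GNU sort -V, which orders dotted version strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum taken from the RcppArmadillo refusal described above.
MIN_GCC="4.6"

# Only query gcc if one is actually on the PATH.
if command -v gcc >/dev/null 2>&1; then
    have="$(gcc -dumpversion)"
    if version_ge "$have" "$MIN_GCC"; then
        echo "gcc $have is new enough (>= $MIN_GCC)"
    else
        echo "gcc $have is older than $MIN_GCC; load a newer module first"
    fi
fi
```

Running this once with the default compiler and once after a module
load would show which of the two a given shell session will hand to R.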

I worry...

1)  Will relying on several GCC versions make the packages incompatible
with R, or with each other?

I logged out, logged back in, with R 3.3.1 I can run

library(RcppArmadillo)
library(Rcpp)

with no errors so far. But I'm not stress testing it much.

Should I rebuild everything?

I expect that if I were to use gcc-6 on one package, it would not be
compatible with binaries built with 4.4.7.  But is there a zone of
tolerance allowing 4.4.7 and 4.9 packages to coexist?
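
To see which compiler each installed package was actually built with,
one option is to read the ident strings that GCC normally records in
the .comment section of a shared object. This is only a sketch: the
helper name so_compiler and the library path are assumptions, and it
reports what was used at build time, not whether a given mix is safe.

```shell
# Hypothetical helper: print the GCC ident strings stored in the
# .comment section of a shared object (empty output if none are found).
so_compiler() {
    readelf -p .comment "$1" 2>/dev/null | grep -o 'GCC: .*' | sort -u
}

# Assumed location of the R package library; substitute the real one.
RLIB="${RLIB:-/usr/local/lib64/R/library}"

# List every package shared object with the compiler(s) recorded in it.
if command -v readelf >/dev/null 2>&1 && [ -d "$RLIB" ]; then
    find "$RLIB" -name '*.so' | while read -r so; do
        printf '%s\t%s\n' "$so" "$(so_compiler "$so" | tr '\n' ' ')"
    done
fi
```

The listing would make it easy to spot which packages came from 4.4.7
and which from 4.9.2 if a rebuild turns out to be necessary.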

2) If I build with non-default GCC, are all of the R users going to
hit trouble if they don't have the same GCC I use?  Unless I make some
extraordinary effort, they are getting GCC 4.4.7. If they try to
install a package, they are getting that GCC, not the one I use to
build RcppArmadillo or the other trouble cases (or everything, if you
say I need to go back and rebuild).

From an administrative point of view, should I tie R-3.3.1 to a
particular version of GCC? I think I could learn how to do that.
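
One way the tie-in might look is a site-wide Makevars.site file in the
etc directory under R's home, which R consults when installing
packages. The sketch below is an assumption-laden illustration, not a
tested recipe: the prefix /opt/gcc/4.9.2 is hypothetical (module show
gcc/4.9.2 should reveal the real one), and CXX1X is the name I believe
R 3.3.x uses for the C++11 compiler. It writes to a scratch directory
so that it is harmless to run as-is.

```shell
# Assumed install prefix of the gcc/4.9.2 module; hypothetical path.
GCC_PREFIX="${GCC_PREFIX:-/opt/gcc/4.9.2}"

# Target directory. For a real install this would be "$(R RHOME)/etc";
# a temporary directory is used here so nothing is overwritten.
ETC_DIR="${ETC_DIR:-$(mktemp -d)}"

# Write the make variables that select the compilers for package builds.
cat > "$ETC_DIR/Makevars.site" <<EOF
# Compilers used when installing packages into this R
CC = $GCC_PREFIX/bin/gcc
CXX = $GCC_PREFIX/bin/g++
CXX1X = $GCC_PREFIX/bin/g++
FC = $GCC_PREFIX/bin/gfortran
F77 = $GCC_PREFIX/bin/gfortran
EOF

# Show the result for review before copying it into place.
cat "$ETC_DIR/Makevars.site"
```

This only steers which compiler gets invoked. Whether the matching
shared libraries are found at run time by users who have not loaded
the module is a separate question.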

On the cluster, they use the module framework. There are about 50
versions of GCC.  It is easy enough to ask for a newer one:

$ module load gcc/4.9.2

It puts the gcc 4.9.2 binaries and shared libraries at the front of the PATHs.

pj


-- 
Paul E. Johnson   http://pj.freefaculty.org
Director, Center for Research Methods and Data Analysis http://crmda.ku.edu

To write me directly, address me at pauljohn at ku.edu.
