[R-SIG-Mac] slow compiling times with clang 4.0.0 (but faster code?)

Hervé Pagès hpages at fredhutch.org
Sat Mar 25 22:33:41 CET 2017


Hi,

Following the lead of the R folks, we've started to build and check
Bioconductor packages on El Capitan using the compilers that Simon
made available here:

   https://r.research.att.com/libs/

The cran-usr-local-darwin15.6-20170320.tar.gz tarball contains
clang 4.0.0, which is used to compile package C and C++ code.

Our latest daily build report can be found here:

   https://bioconductor.org/checkResults/3.5/bioc-LATEST/

The builder running El Capitan + clang 4.0.0 is veracruz2:

   veracruz2:~ biocbuild$ clang -v
   clang version 4.0.0 (tags/RELEASE_400/final)
   Target: x86_64-apple-darwin15.6.0
   Thread model: posix
   InstalledDir: /usr/local/clang+llvm-4.0.0-x86_64-apple-darwin/bin

toluca2 is another builder that has the same specs as veracruz2
but is running Mavericks and uses the clang compiler from Apple's
Command Line Developer Tools:

   toluca2:~ biocbuild$ clang -v
   Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)
   Target: x86_64-apple-darwin13.4.0
   Thread model: posix

We're observing significantly longer compile times on veracruz2
than on toluca2. For example, looking at packages where
'R CMD INSTALL' time is dominated by compilation (times are given
as veracruz2 vs toluca2):

   683.0 vs 483.6 seconds for the mzR package (C++ code):

     https://bioconductor.org/checkResults/3.5/bioc-LATEST/mzR/veracruz2-install.html
     https://bioconductor.org/checkResults/3.5/bioc-LATEST/mzR/toluca2-install.html

   378.3 vs 299.9 seconds for the flowWorkspace package (C++ code):

     https://bioconductor.org/checkResults/3.5/bioc-LATEST/flowWorkspace/veracruz2-install.html
     https://bioconductor.org/checkResults/3.5/bioc-LATEST/flowWorkspace/toluca2-install.html

   186.3 vs 126.2 seconds for the rTANDEM package (C++ code):

     https://bioconductor.org/checkResults/3.5/bioc-LATEST/rTANDEM/veracruz2-install.html
     https://bioconductor.org/checkResults/3.5/bioc-LATEST/rTANDEM/toluca2-install.html

   109.4 vs 80.8 seconds for the rhdf5 package (C code):

     https://bioconductor.org/checkResults/3.5/bioc-LATEST/rhdf5/veracruz2-install.html
     https://bioconductor.org/checkResults/3.5/bioc-LATEST/rhdf5/toluca2-install.html

   etc...

The slowdown seems pretty consistent with a time_on_veracruz2 /
time_on_toluca2 ratio varying between 1.25 and 1.5.
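
For anyone who wants to check the arithmetic, here is a minimal sketch
that recomputes the ratio from the four install times quoted above. The
variable and function names are only illustrative; the numbers are the
ones from the build report. The ratios come out at roughly 1.41, 1.26,
1.48 and 1.35.

```python
# Install times in seconds, copied from the figures quoted above:
# (package, time on veracruz2, time on toluca2)
install_times = [
    ("mzR", 683.0, 483.6),
    ("flowWorkspace", 378.3, 299.9),
    ("rTANDEM", 186.3, 126.2),
    ("rhdf5", 109.4, 80.8),
]


def slowdown_ratios(times):
    """Return {package: time_on_veracruz2 / time_on_toluca2}."""
    return {pkg: veracruz2 / toluca2 for pkg, veracruz2, toluca2 in times}


ratios = slowdown_ratios(install_times)
for pkg, ratio in ratios.items():
    print(f"{pkg:15s} {ratio:.2f}")
```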

In addition to clang 4.0.0 we also have the clang compiler from
Apple's Command Line Developer Tools on veracruz2:

   veracruz2:sandbox biocbuild$ /usr/bin/clang -v
   Apple LLVM version 8.0.0 (clang-800.0.42.1)
   Target: x86_64-apple-darwin15.6.0
   Thread model: posix
   InstalledDir: /Library/Developer/CommandLineTools/usr/bin

If I switch between the 2 compilers on veracruz2 I observe a similar
slowdown i.e. the time_with_clang_4.0.0 / time_with_Apple_clang ratio
is about the same as the previous ratios. The only difference this
time is the compiler.
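
In case somebody wants to try the same comparison outside of
'R CMD INSTALL', below is a rough sketch of a timing harness. It is not
what the build system does: the helper names, the -O2 flag and the
source file are assumptions for illustration, and the clang 4.0.0 path
assumes the binary sits directly under the InstalledDir shown earlier.
For C++ code one would presumably point it at the matching clang++
binaries instead.

```python
import subprocess
import sys
import time

# Compiler locations based on the `clang -v` output above
# (assumption: the clang 4.0.0 binary lives directly in its InstalledDir).
COMPILERS = {
    "clang 4.0.0": "/usr/local/clang+llvm-4.0.0-x86_64-apple-darwin/bin/clang",
    "Apple clang": "/usr/bin/clang",
}


def time_command(cmd):
    """Run cmd (a list of arguments) and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def compare_compilers(source, flags=("-O2", "-c")):
    """Time compiling one source file with each compiler (hypothetical helper).

    The object file is written to /dev/null so nothing is left behind.
    """
    return {
        name: time_command([path, *flags, source, "-o", "/dev/null"])
        for name, path in COMPILERS.items()
    }
```

Calling compare_compilers("some_file.c") on veracruz2 (with a real
source file in place of the hypothetical some_file.c) would return one
timing per compiler; dividing the two gives the ratio discussed above.
A single run is noisy, so repeating it a few times is probably wise.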

I was wondering if anybody else has observed this, or if this is a
known "issue" with clang 4.0.0.

On the other hand the good news is that packages with no native
code seem to build and check slightly faster on veracruz2 than on
toluca2. For example:

https://bioconductor.org/checkResults/3.5/bioc-LATEST/GenomicFeatures/veracruz2-buildsrc.html
https://bioconductor.org/checkResults/3.5/bioc-LATEST/GenomicFeatures/toluca2-buildsrc.html

https://bioconductor.org/checkResults/3.5/bioc-LATEST/GenomicFeatures/veracruz2-checksrc.html
https://bioconductor.org/checkResults/3.5/bioc-LATEST/GenomicFeatures/toluca2-checksrc.html

Note that R was compiled with Apple's clang on toluca2 and with clang
4.0.0 on veracruz2. So is the deal that clang 4.0.0 produces more
efficient code at the cost of longer compilation times?

Cheers,
H.


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


