[Bioc-devel] Byte compiling of packages

Martin Morgan mtmorgan at fhcrc.org
Wed Jan 8 17:33:07 CET 2014


On 01/08/2014 07:43 AM, Julian Gehring wrote:
> Hi,
>
> R-2.13 introduced the byte code compiler package 'compiler' [1], which can be
> used to precompile the R code of a package at installation time (using the
> ByteCompile field in the DESCRIPTION file or '--byte-compile' with R CMD
> INSTALL).  I have been using this lately with my own packages, and had some
> notable improvements (and also did not find a case where it decreased the
> performance).  I was wondering if anyone has extensive experience with this.
> Specifically, are there good reasons not to use this by default for packages?

My prejudice (i.e., no actual data) is that improvements from byte compilation
point to code that could instead be vectorized, for even greater improvement.
For instance, a for loop over a vector can be replaced with a vectorized
alternative:

     library(compiler)
     library(microbenchmark)
     f = function(n) { i = 0; for (j in seq_len(n)) i = i + 1; i }
     fc = cmpfun(f)
     v = function(n) { i = integer(n); i[] = 1; sum(i) }
     identical(f(1e5), v(1e5))
     ## [1] TRUE
     microbenchmark(f(1e5), fc(1e5), v(1e5), times=10)
     ## Unit: milliseconds
     ##       expr       min        lq    median        uq       max neval
     ##   f(1e+05) 32.246407 36.119512 37.028699 37.451135 37.894748    10
     ##  fc(1e+05) 13.777743 14.610523 15.151645 15.303884 15.580075    10
     ##   v(1e+05)  2.628862  2.644882  2.803085  2.862608  3.041894    10

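The speed-up from compiling in the example (about 2.4x, comparing the median
times for f and fc) seems to be more or less typical of a 'good' improvement
from compiling functions (what counts as 'notable' in your case?). The
vectorized version is about 13x faster than the uncompiled loop, and in lots of
cases there can easily be improvements of an order of magnitude or more from
bringing the code in line with R's design principles.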
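
For what it's worth, the ratios can be worked out directly from the median
column of the microbenchmark output above. This is only a sketch of the
arithmetic; the variable names (medians, speedup_vs_loop, speedup_vs_compiled)
are mine and purely illustrative, and the numbers are from one run on one
machine.

```r
## Median times (milliseconds) copied from the microbenchmark output above.
medians <- c(f = 37.028699, fc = 15.151645, v = 2.803085)

## Speed-up of each version relative to the plain loop f().
speedup_vs_loop <- medians[["f"]] / medians
round(speedup_vs_loop, 1)
## fc is about 2.4x faster than f; v is about 13.2x faster than f

## Speed-up of the vectorized version relative to the compiled loop fc().
speedup_vs_compiled <- medians[["fc"]] / medians[["v"]]
round(speedup_vs_compiled, 1)
## about 5.4x: vectorizing still beats compiling the loop
```

So even after byte compilation, the vectorized version comes out several times
faster in this particular example.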

I think byte compilation is intended to be fully compatible with R, so it
shouldn't introduce any 'errors' (and if it did, these would be fixed).
There is a cost to byte compilation, but it is paid at installation time
rather than at load time, and is not really noticeable except for large
packages. If your code is already well-written and byte compilation still
improves it, then it seems reasonable to enable byte compilation as a
package-level setting.
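
If you want to convince yourself about compatibility for a particular function
before switching the setting on, a minimal sketch along these lines might help.
The function loop_sum() is a made-up stand-in for your own code, and looking
for "<bytecode" in the printed function is just an informal check, not an
official API.

```r
library(compiler)

## Hypothetical stand-in for a package function: a plain accumulating loop.
loop_sum <- function(n) {
    total <- 0
    for (j in seq_len(n)) total <- total + j
    total
}

## Compile it explicitly, as byte compilation at install time would do for
## the functions in a package.
loop_sum_c <- cmpfun(loop_sum)

## The compiled and original versions should return identical results.
stopifnot(identical(loop_sum(1000), loop_sum_c(1000)))

## Informal check: a compiled closure prints with a "<bytecode: ...>" line.
shows_bytecode <- any(grepl("<bytecode", capture.output(print(loop_sum_c)),
                            fixed = TRUE))
shows_bytecode
```

For the package as a whole, the equivalent is the ByteCompile field or the
'--byte-compile' option that Julian mentions above.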

Martin

>
> Best wishes
> Julian
>
> [1] http://dirk.eddelbuettel.com/blog/2011/04/12/


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


