[Bioc-devel] R cmd check time limits for BioConductor
Herve Pages
hpages at fhcrc.org
Tue Jun 10 20:00:26 CEST 2008
Hi,
Robert Gentleman wrote:
[...]
> The time limits were instituted because the build system was unable to
> complete within 24 hours (and the math is pretty simple, there are over
> 260 packages, so we need to build more than 10 per hour to be done every
> day).
Just to clarify:
- We build BioC devel *and* BioC release every day.
- Some build machines are running both builds (devel and release) so at most
12 hours can be spent on each build (the devel builds run from noon to midnight
and the release builds from midnight to noon, Seattle time).
- The builds are parallelized, i.e. up to 4 'R CMD check' processes can run
simultaneously on the same build machine at any given time. As a consequence,
an entire build run (250-270 packages) takes between 6 and 11 hours
on each build machine (64-bit Linux machines like wilson1-2 are the fastest).
Parallelization is the only way an entire build run can be done in less
than 12 hours on all the machines.
- Note that 'R CMD check' is not the only command that is executed for each
package. The build stages are: (a) install the dependencies, (b) run 'R CMD build',
(c) run 'R CMD check' and (d) build the binary package (on Windows and Mac OS X
only).
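To make the arithmetic behind these numbers explicit, here is a rough sketch. It assumes 260 packages, perfectly balanced parallel slots, and no dependency ordering between packages, none of which holds exactly on the real build machines; the function names are made up for illustration.

```python
def required_rate(n_packages, window_hours):
    """Packages per hour needed to finish n_packages within window_hours."""
    return n_packages / window_hours


def per_package_budget_minutes(n_packages, window_hours, workers):
    """Average wall-clock minutes available per package (all stages
    included) when `workers` packages are processed at the same time.
    Assumes ideal load balancing, which is an idealization."""
    return window_hours * 60 * workers / n_packages


# One build per day on a dedicated machine: a bit more than 10 per hour.
print(f"{required_rate(260, 24):.1f} packages/hour for a 24-hour window")

# Devel and release sharing a machine: only 12 hours per build.
print(f"{required_rate(260, 12):.1f} packages/hour for a 12-hour window")

# Average time budget per package, serial versus 4 parallel slots.
print(f"{per_package_budget_minutes(260, 12, 1):.1f} min/package, serial")
print(f"{per_package_budget_minutes(260, 12, 4):.1f} min/package, 4 slots")
```

Under these assumptions a serial build would leave less than 3 minutes per package for all stages, while 4 parallel slots leave about 11 minutes on average.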
During the same build run, a lot of CPU cycles are wasted because the same
thing can be computed several times. For example each vignette is tested twice:
the 1st time by 'R CMD build' and the 2nd time by 'R CMD check'. We could easily
avoid this by running 'R CMD check --no-vignettes': that would probably make
the builds 10%-30% faster without compromising the current testing paradigm.
Other things are done several times (like installing the exact same package 2
or 3 times, even 4 times in some rare situations) but trying to avoid this
would be more complicated.
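As a back-of-the-envelope illustration of what the 'R CMD check --no-vignettes' change could mean, the sketch below applies the 10%-30% guess to the 6-11 hour range quoted above. It assumes "faster" means the wall-clock time of a whole build run shrinks by that fraction, which is only one possible reading; the function name is hypothetical.

```python
def estimated_run_hours(fastest, slowest, min_gain, max_gain):
    """Return (best_case, worst_case) build-run duration in hours,
    assuming the run time is reduced by a fraction between min_gain
    and max_gain (an assumption, not a measured result)."""
    best_case = fastest * (1 - max_gain)
    worst_case = slowest * (1 - min_gain)
    return best_case, worst_case


best, worst = estimated_run_hours(6, 11, 0.10, 0.30)
print(f"estimated build run: {best:.1f} to {worst:.1f} hours")
```

With these inputs the estimate comes out at roughly 4.2 to 9.9 hours per build run, which would leave more headroom under the 12-hour limit on the slower machines.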
Cheers,
H.