[Rd] [RFC] A case for freezing CRAN

Jeroen Ooms jeroen.ooms at stat.ucla.edu
Tue Mar 18 21:24:46 CET 2014

This came up again recently with an irreproducible paper. Below is an
attempt to make a case for extending the r-devel/r-release cycle to
CRAN packages. These suggestions are not in any way intended as
criticism of anyone or of the status quo.

The proposal described in [1] is to freeze a snapshot of CRAN along
with every release of R. In this design, updates for contributed
packages are treated the same as updates for base packages, in the
sense that they are only published to the r-devel branch of CRAN and
do not affect users of "released" versions of R. Thereby all users,
stacks and applications using a particular version of R will by
default be using the identical version of each CRAN package. The
Bioconductor project uses similar policies.
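
From the user's side, the design could look roughly like the sketch
below: the repository is derived from the running R version, so
everyone on the same release resolves packages against the same
snapshot. The function name frozen_repo() and the base URL are
hypothetical placeholders; no such layout exists on CRAN today.

```r
# Hypothetical helper: map an R release to the URL of its frozen snapshot.
# The base URL and directory layout are assumptions made for illustration.
frozen_repo <- function(r_version = getRversion(),
                        base_url = "https://cran.example.org/frozen") {
  paste0(base_url, "/", as.character(r_version))
}

# Point this session at the snapshot matching the running version of R.
options(repos = c(CRAN = frozen_repo()))
```

With such a default in place, install.packages() would resolve to the
same package versions for every user of a given R release.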

This system has several important advantages:

## Reproducibility

Currently R/Sweave/knitr scripts are unstable because of ambiguity
introduced by constantly changing CRAN packages. This causes scripts
to break or change behavior when upstream packages are updated, which
makes reproducing old results extremely difficult.

A common counter-argument is that script authors should document the
package versions used in the script using sessionInfo(). However, even
if authors did this manually, reconstructing the author's environment
from this information is cumbersome and often nearly impossible,
because binary packages might no longer be available, dependency
conflicts can arise, etc. See [1] for a worked example. In practice,
the current system causes many results or documents generated with R
not to be reproducible, sometimes after only a few months.
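
To illustrate the gap, here is a minimal sketch of what a reader can do
with recorded versions: compare them against the local installation.
The helper check_versions() is hypothetical. Detecting a mismatch is
the easy part; the sketch does nothing to restore the recorded
versions, which is where the difficulty described above begins.

```r
# Hypothetical helper: compare recorded package versions (a named character
# vector, e.g. copied from an author's sessionInfo() output) with what is
# installed locally.
check_versions <- function(recorded) {
  installed <- vapply(names(recorded), function(pkg) {
    # packageVersion() signals an error if the package is not installed.
    tryCatch(as.character(utils::packageVersion(pkg)),
             error = function(e) NA_character_)
  }, character(1))
  matches <- vapply(seq_along(recorded), function(i) {
    !is.na(installed[i]) &&
      package_version(installed[i]) == package_version(recorded[i])
  }, logical(1))
  data.frame(package = names(recorded),
             recorded = unname(recorded),
             installed = unname(installed),
             matches = matches,
             stringsAsFactors = FALSE)
}

# Example: versions as they might have been recorded by a script author.
recorded <- c(stats = as.character(utils::packageVersion("stats")),
              notInstalledPkg123 = "1.0")
print(check_versions(recorded))
```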

In a system where contributed packages inherit the r-base release
cycle, scripts will behave the same across users/systems/time within a
given version of R. This greatly reduces ambiguity of R behavior, and
has the potential to make reproducibility a natural part of the
language, rather than a tedious exercise.

## Repository Management

Just as scripts suffer from upstream changes, so do packages that
depend on other packages. A particular package that has been
developed and tested against the current version of a particular
dependency is not guaranteed to work against *any future version* of
that dependency. Therefore, packages inevitably break over time as
their dependencies are updated.

One recent example is the Rcpp 0.11 release, which required all
reverse dependencies to be rebuilt/modified. This update caused some
serious disruption on our production servers. Initially we refrained
from updating Rcpp on these servers to prevent currently installed
packages that depend on Rcpp from breaking. However, soon after the
Rcpp 0.11 release, many other CRAN packages started to require Rcpp >=
0.11, and our users started complaining about not being able to
install those packages. This resulted in the impossible situation
where currently installed packages would not work with the new Rcpp,
but newly installed packages would not work with the old Rcpp.
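
The deadlock can be written down as a small constraint check. The
package names and the two candidate Rcpp versions below are
illustrative assumptions, not an inventory of our servers; the point is
only that no single installed version satisfies both groups of
packages.

```r
# Hypothetical constraints on the installed Rcpp version.
constraints <- list(
  installedPkgA = function(v) v < "0.11.0",   # built against the old Rcpp
  newPkgB       = function(v) v >= "0.11.0"   # declares Rcpp (>= 0.11)
)

# Candidate Rcpp versions a server could have installed (illustrative).
candidates <- numeric_version(c("0.10.6", "0.11.0"))

# For each candidate, check whether every package's constraint holds.
satisfies_all <- vapply(seq_along(candidates), function(i) {
  all(vapply(constraints, function(ok) ok(candidates[i]), logical(1)))
}, logical(1))

print(data.frame(rcpp = as.character(candidates), satisfies_all))
```

Since a library holds only one version of a package at a time, the
conflict can only be resolved by rebuilding or updating one of the two
groups.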

Current CRAN policies blame this problem on package authors. However
as is explained in [1], this policy does not solve anything, is
unsustainable with growing repository size, and sets completely the
wrong incentives for contributing code. Progress comes with breaking
changes, and the system should be able to accommodate this. Much of
the trouble could have been prevented by a system that does not push
bleeding edge updates straight to end-users, but has a devel branch
where conflicts are resolved before publishing them in the next
release.

## Reliability

Another example, this time on a very small scale. We recently
discovered that R code plotting medal counts from the Sochi Olympics
generated different results for users on OS X than it did on
Linux/Windows. After some debugging, we narrowed it down to the XML
package. The application used the following code to scrape results
from the Sochi website:

XML::readHTMLTable("http://www.sochi2014.com/en/speed-skating", which=2, skip=1)

This code was developed and tested on a Mac, but results in a
different winner on Windows/Linux. This happens because the current
version of the XML package on CRAN is 3.98, but the latest Mac binary
is 3.95. Apparently the newer version of XML introduces a tiny change
that causes HTML table headers to become colnames, rather than a row
in the matrix, resulting in different medal counts.

This example illustrates that we should never assume package versions
to be interchangeable. Any small bugfix release can have side effects
altering results. It is impossible to protect code against such
upstream changes using CMD check or unit testing. All R scripts and
packages are really only developed and tested for a single version of
their dependencies. Assuming anything else makes results
untrustworthy, and code unreliable.

## Summary

Extending the r-release cycle to CRAN seems like a solution that would
be easy to implement. Package updates would only be pushed to the
r-devel branches of CRAN, rather than to r-release and r-release-old.
This separates development from production/use in a way that is common
sense in most open source communities. Benefits for R include:

- Regular R users (statisticians, researchers, students, teachers) can
share their homemade scripts/documents/packages and rely on them to
work and produce the same results within a given version of R, without
manual efforts to manage package versions.

- Package authors can publish breaking changes to the devel branch
without causing major disruption or affecting users and/or
maintainers. Authors of depending packages have a timeframe to sync
their package with upstream changes before the next release.

- CRAN maintainers can focus quality control and testing efforts on
the devel branch around the time of the code freeze. No need for
crisis management when a package update introduces some severe
breaking changes. Users of released versions are unaffected.

[1] http://journal.r-project.org/archive/2013-1/ooms.pdf

