[R-pkg-devel] Submission to CRAN when package needs personal data (API key)

Thu Sep 6 16:32:39 CEST 2018

On Wed, Sep 5, 2018 at 3:03 PM Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>
> On 05/09/2018 2:20 PM, Henrik Bengtsson wrote:
> > I take a complementary approach; I condition on my home-made
> > R_TEST_ALL variable.  Effectively, I do:
> >
> > if (as.logical(Sys.getenv("R_TEST_ALL", "FALSE"))) {
> >     ...
> > }
> >
> > and set R_TEST_ALL=TRUE when I want to run that part of the test.  You
> > can also imagine refined versions of this, e.g. R_TEST_SETS=foo,bar
> > and test scripts with:
> >
> > if ("foo" %in% strsplit(Sys.getenv("R_TEST_SETS"), split="[, ]+")[[1]]) {
> >     ...
> > }
> >
> > That avoids making assumptions on where the tests are submitted/run,
> > may it be CRAN, Bioconductor, Travis CI, ...
>
> This is the right way to do it.

I would like to gently push back on this assertion: if CRAN set an
environment variable we would have one single convention that all
packages could rely on. The current system relies on each package
author evolving their own solution. This makes life difficult when you
are running local reverse dependency checks: there is no way to
systematically assert that you want to run tests in a way as similar
as possible to CRAN.

I know that the CRAN maintainers already have a very large workload,
and I hate to add to it, but setting CRAN=1 in a few profile files
doesn't seem excessively burdensome.
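
As a rough sketch of what a shared convention could look like on the
package side: the CRAN variable and the on_cran() helper below are
hypothetical (CRAN does not set such a variable today, and this is not
an existing API), and the accepted values are just my assumption.

```r
# Hypothetical helper: TRUE only when the (assumed) CRAN environment
# variable is set to a truthy value. Nothing sets this variable today.
on_cran <- function() {
  value <- tolower(Sys.getenv("CRAN", unset = ""))
  value %in% c("1", "true", "yes")
}

# Guard slow or API-dependent tests with the shared convention.
if (!on_cran()) {
  message("Running the full test set")
} else {
  message("Running the reduced, CRAN-style test set")
}
```

With something like this, running reverse dependency checks locally
with CRAN=1 set would exercise the same subset of tests in every
package that follows the convention.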

> This discussion has come up before.  If you want to submit to CRAN, you
> should include tests that satisfy their requests.  If you want even more
> tests, there are several ways to add them in addition to the CRAN tests.
> Henrik's is one, "R CMD check --test-dir=myCustomTests" is another.
>
> Rainer's package is unusual, in that from his description it can't
> really work unless the user obtains an API key.  There are other
> packages like that, and those cases need manual handling by CRAN:  they
> don't really run full tests by default.  But the vast majority of
> packages should be able to live within the CRAN guidelines.

Ten years ago, I would definitely have supported this statement. But I
am not sure it is still correct today, as there are now many packages
that require a connection to a web API to work (or depend on a package
that uses an API). Additionally, CRAN only allows a limited amount of
compute time for each check, so there are often longer tests that you
want to run locally but not on CRAN. CRAN is a specialised testing
service and it does have different constraints to your local machine,
Travis, and Bioconductor.

A quick search of the CRAN mirror on GitHub
(https://github.com/search?q=org%3Acran+skip_on_cran&type=Code)
reveals that there are ~2700 tests that use testthat::skip_on_cran().
This is obviously an underestimate of the total number of tests
skipped on CRAN, as many packages don't use testthat, or use an
alternative technique to avoid running code on CRAN.
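
For anyone unfamiliar with the pattern, a typical use looks roughly
like the sketch below. It assumes testthat is installed; slow_result()
is a made-up stand-in for a slow or API-dependent call. In the
testthat versions I am aware of, skip_on_cran() keys off the NOT_CRAN
environment variable, so whether the test runs depends on how that is
set (and, in newer versions, on whether the session is interactive).

```r
library(testthat)

# Made-up stand-in for something slow or network-dependent.
slow_result <- function() {
  sum(1:10)
}

test_that("slow computation gives the expected total", {
  # Skip this test unless the environment indicates we are not on CRAN.
  skip_on_cran()
  expect_equal(slow_result(), 55)
})
```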

Hadley

-- 
http://hadley.nz