[R-pkg-devel] Submission to CRAN when package needs personal data (API key)

Fri Sep 7 09:09:00 CEST 2018

> On 7 Sep 2018, at 02:16, Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
> 
> On 06/09/2018 10:32 AM, Hadley Wickham wrote:
>> On Wed, Sep 5, 2018 at 3:03 PM Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>> 
>>> On 05/09/2018 2:20 PM, Henrik Bengtsson wrote:
>>>> I take a complementary approach; I condition on, my home-made,
>>>> R_TEST_ALL variable.  Effectively, I do:
>>>> 
>>>> if (as.logical(Sys.getenv("R_TEST_ALL", "FALSE"))) {
>>>>     ...
>>>> }
>>>> 
>>>> and set R_TEST_ALL=TRUE when I want to run that part of the test.  You
>>>> can also imagine refined versions of this, e.g. R_TEST_SETS=foo,bar
>>>> and test scripts with:
>>>> 
>>>> if ("foo" %in% strsplit(Sys.getenv("R_TEST_SETS"), split="[, ]+")[[1]]) {
>>>>     ...
>>>> }
>>>> 
>>>> That avoids making assumptions on where the tests are submitted/run,
>>>> may it be CRAN, Bioconductor, Travis CI, ...
>>> 
>>> This is the right way to do it.
>> I would like to gently push back on this assertion: if CRAN set an
>> environment variable we would have one single convention that all
>> packages could rely on.
> 
> When packages delete tests just for CRAN, the quality of the repository suffers.

Absolutely. But at the moment, one is forced to use workarounds if tests **cannot** be run on CRAN (API keys, computing times, …) but should be run locally. It would make much more sense if there were a standardised way of dealing with this.
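
To illustrate the kind of workaround I mean, here is a minimal sketch for the API key case. The variable name MYPKG_API_KEY and both helper functions are made up for this example; they are not taken from any existing package.

```r
# Hypothetical helper: TRUE only if an API key is available in the environment.
# "MYPKG_API_KEY" is an assumed name, not a convention of any real package.
has_api_key <- function(var = "MYPKG_API_KEY") {
  nzchar(Sys.getenv(var, unset = ""))
}

# Evaluate a test expression only when the key is present.
# Thanks to lazy evaluation, `expr` is never evaluated when the key is missing.
run_if_api_key <- function(expr, var = "MYPKG_API_KEY") {
  if (!has_api_key(var)) {
    message("Skipping: ", var, " is not set")
    return(invisible(FALSE))
  }
  force(expr)
  invisible(TRUE)
}
```

On a machine without the key (such as CRAN), the wrapped test is skipped with a message; locally, with the key exported, it runs.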

> Users should be able to check an install by running the tests that passed on CRAN and seeing them pass on their system as well.

Also agreed. So if the user sets the environment variable CRAN for the tests, the CRAN tests are executed (as today); if it is not set, the extended tests are executed.
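
A rough sketch of how a test script could use such a variable, assuming CRAN (or the user) exported CRAN=1. CRAN does not set this variable today, and the function names are hypothetical. I also assume here that the extended set includes the CRAN tests.

```r
# Assumption: the check environment exports CRAN=1 (CRAN does not do this today).
on_cran <- function() {
  nzchar(Sys.getenv("CRAN", unset = ""))
}

# Pick the tests to run: only the CRAN set when CRAN is set,
# otherwise the CRAN set plus the extended tests.
tests_to_run <- function(cran_tests, extended_tests) {
  if (on_cran()) cran_tests else c(cran_tests, extended_tests)
}
```

A user who wants to reproduce exactly what was checked on CRAN would set CRAN=1 before running the tests.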

> 
>> The current system relies on each package
>> author evolving their own solution. This makes life difficult when you
>> are running local reverse dependency checks: there is no way to
>> systematically assert that you want to run tests in a way as similar
>> as possible to CRAN.
> 
> Most packages don't need to evolve anything:  the CRAN tests are sufficient.

But there seems to be a need to exclude certain tests, for various reasons.

> 
>> I know that the CRAN maintainers already have a very large workload,
>> and I hate to add to it, but setting CRAN=1 in a few profile files
>> doesn't seem excessively burdensome.
> 
> It would be easy to do that, but then CRAN wouldn't be testing the same things that users would test.

See my comment above.

> A user might see a test failure that didn't happen on CRAN, and suspect that there was something wrong with their install, when in fact it was an author trying to hide a deficiency in their package from CRAN.

Only if they execute the extended tests. I can still hide deficiencies in my package by not including a specific test or by doctoring the result, if that is my intention. But the extended tests could be used to test additional setup options which cannot be tested on CRAN.

> 
> 
>>> This discussion has come up before.  If you want to submit to CRAN, you
>>> should include tests that satisfy their requests.  If you want even more
>>> tests, there are several ways to add them in addition to the CRAN tests.
>>>   Henrik's is one, "R CMD check --test-dir=myCustomTests" is another.
>>> 
>>> Rainer's package is unusual, in that from his description it can't
>>> really work unless the user obtains an API key.  There are other
>>> packages like that, and those cases need manual handling by CRAN:  they
>>> don't really run full tests by default.  But the vast majority of
>>> packages should be able to live within the CRAN guidelines.
>> 10 years ago, I would have definitely supported this statement. But I
>> am not sure it is still correct today, as there are now many packages
>> that require a connection to web API to work (or depend on a package
>> that uses an API). Additionally, CRAN only allows a limited amount of
>> compute time for each check, so there are often longer tests that you
>> want to run locally but not on CRAN. CRAN is a specialised testing
>> service and it does have different constraints to your local machine,
>> travis, and bioconductor.
>> A quick search of the CRAN mirror on github
>> (https://github.com/search?q=org%3Acran+skip_on_cran&type=Code)
>> reveals that there are ~2700 tests that use testthat::skip_on_cran().
>> This is obviously an underestimate of the total number of tests
>> skipped on CRAN, as many packages don't use testthat, or use an
>> alternative technique to avoid running code on CRAN.
> 
> That's not so obviously an underestimate, as packages that use that technique use it many times, not just once per package.  (A sample I looked at averaged 15 calls per package, but I don't know if that's unbiased.)
> 
> But in any case, the skip_on_cran() function implements a version of Henrik's approach.  The name of the function is misleading, it doesn't attempt to distinguish between CRAN and a regular user.

I would guess that is because it can't. If there were a standardised way of identifying that the tests are run on CRAN, I would use it immediately.
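
For reference, my understanding is that testthat::skip_on_cran() decides based on the NOT_CRAN environment variable (which devtools sets to "true") rather than on anything CRAN itself provides. A stripped-down sketch of that opt-in logic, without testthat and with a made-up function name:

```r
# Opt-in check in the style of testthat::skip_on_cran(): the extended tests
# are enabled only when NOT_CRAN is explicitly set to "true". Anything else,
# including an unset variable, is treated as "possibly on CRAN".
extended_tests_enabled <- function() {
  identical(Sys.getenv("NOT_CRAN"), "true")
}
```

So a regular user who runs the tests without setting NOT_CRAN gets the same reduced set as CRAN, which is the behaviour Duncan describes.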

Cheers,

Rainer

> 
> Duncan Murdoch
--
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)

University of Zürich

Cell:       +41 (0)78 630 66 57
email:      Rainer using krugs.de
Skype:      RMkrug

PGP: 0x0F52F982
