[R-pkg-devel] Submission to CRAN when package needs personal data (API key)

Fri Sep 7 19:10:18 CEST 2018

On 07/09/2018 3:09 AM, Rainer Krug wrote:
> 
> 
>> On 7 Sep 2018, at 02:16, Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>
>> On 06/09/2018 10:32 AM, Hadley Wickham wrote:
>>> On Wed, Sep 5, 2018 at 3:03 PM Duncan Murdoch
>>> <murdoch.duncan using gmail.com> wrote:
>>>>
>>>> On 05/09/2018 2:20 PM, Henrik Bengtsson wrote:
>>>>> I take a complementary approach; I condition on, my home-made,
>>>>> R_TEST_ALL variable.  Effectively, I do:
>>>>>
>>>>> if (as.logical(Sys.getenv("R_TEST_ALL", "FALSE"))) {
>>>>>     ...
>>>>> }
>>>>>
>>>>> and set R_TEST_ALL=TRUE when I want to run that part of the test.  You
>>>>> can also imagine refined versions of this, e.g. R_TEST_SETS=foo,bar
>>>>> and test scripts with:
>>>>>
>>>>> if ("foo" %in% strsplit(Sys.getenv("R_TEST_SETS"), split="[, ]+")[[1]]) {
>>>>>     ...
>>>>> }
>>>>>
>>>>> That avoids making assumptions on where the tests are submitted/run,
>>>>> may it be CRAN, Bioconductor, Travis CI, ...
>>>>
>>>> This is the right way to do it.
>>> I would like to gently push back on this assertion: if CRAN set an
>>> environment variable we would have one single convention that all
>>> packages could rely on.
>>
>> When packages delete tests just for CRAN, the quality of the 
>> repository suffers. 
> 
> Absolutely, in some cases. But at the moment, one is forced to use 
> workarounds if tests **cannot** be run on CRAN (API keys, computing 
> times, …) but should be run in local tests. It would make much more 
> sense if there were a standardised way of dealing with this.
> 
> 
>> Users should be able to check an install by running the tests that 
>> passed on CRAN and seeing them pass on their system as well.
> 
> Also agreed - so if the user sets the environment variable CRAN for 
> the test, the CRAN tests are executed (as today); if it is not set, 
> the extended tests are executed.
> 
> 
>>
>>> The current system relies on each package
>>> author evolving their own solution. This makes life difficult when you
>>> are running local reverse dependency checks: there is no way to
>>> systematically assert that you want to run tests in a way as similar
>>> as possible to CRAN.
>>
>> Most packages don't need to evolve anything:  the CRAN tests are 
>> sufficient.
> 
> But there seems to be a need to exclude certain tests, due to various 
> reasons.

That need doesn't just apply to CRAN, it applies to anyone running them 
who doesn't have an API key.  So why not leave those tests out by 
default, with a documented way to enable them?
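
A minimal sketch of what I mean, not taken from any particular package. 
The variable name MYPKG_API_KEY and the helper have_api_key() are made 
up for illustration; a real package would document whatever name it 
actually uses.

```r
# Hypothetical helper: TRUE only if the person running the tests has
# supplied a key.  MYPKG_API_KEY is an assumed, made-up variable name.
have_api_key <- function() {
  # Sys.getenv() returns "" for an unset variable, so nzchar() is FALSE
  nzchar(Sys.getenv("MYPKG_API_KEY"))
}

if (have_api_key()) {
  # Tests that really call the web service would go here.
  message("MYPKG_API_KEY is set; running API tests")
} else {
  # Default for CRAN and for every user without a key: leave them out.
  message("MYPKG_API_KEY is not set; skipping API tests")
}
```

With this, CRAN and an ordinary user run exactly the same tests, and 
anyone who does have a key can set the variable to run the rest.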

> 
>>
>>> I know that the CRAN maintainers already have a very large workload,
>>> and I hate to add to it, but setting CRAN=1 in a few profile files
>>> doesn't seem excessively burdensome.
>>
>> It would be easy to do that, but then CRAN wouldn't be testing the 
>> same things that users would test. 
> 
> See my comment above.
> 
>> A user might see a test failure that didn't happen on CRAN, and 
>> suspect that there was something wrong with their install, when in 
>> fact it was an author trying to hide a deficiency in their package 
>> from CRAN.
> 
> Only if they execute the extended tests. I can still hide deficiencies 
> in my package by not applying a specific test or doctoring the result, 
> if that is my intention. But the extended tests could be used to test 
> additional setup options, which cannot be tested on CRAN.
> 
> 
>>
>>
>>>> This discussion has come up before.  If you want to submit to CRAN, you
>>>> should include tests that satisfy their requests.  If you want even more
>>>> tests, there are several ways to add them in addition to the CRAN tests.
>>>>   Henrik's is one, "R CMD check --test-dir=myCustomTests" is another.
>>>>
>>>> Rainer's package is unusual, in that from his description it can't
>>>> really work unless the user obtains an API key.  There are other
>>>> packages like that, and those cases need manual handling by CRAN:  they
>>>> don't really run full tests by default.  But the vast majority of
>>>> packages should be able to live within the CRAN guidelines.
>>> 10 years ago, I would have definitely supported this statement. But I
>>> am not sure it is still correct today, as there are now many packages
>>> that require a connection to web API to work (or depend on a package
>>> that uses an API). Additionally, CRAN only allows a limited amount of
>>> compute time for each check, so there are often longer tests that you
>>> want to run locally but not on CRAN. CRAN is a specialised testing
>>> service and it does have different constraints to your local machine,
>>> travis, and bioconductor.
>>> A quick search of the CRAN mirror on github
>>> (https://github.com/search?q=org%3Acran+skip_on_cran&type=Code)
>>> reveals that there are ~2700 tests that use testthat::skip_on_cran().
>>> This is obviously an underestimate of the total number of tests
>>> skipped on CRAN, as many packages don't use testthat, or use an
>>> alternative technique to avoid running code on CRAN.
>>
>> That's not so obviously an underestimate, as packages that use that 
>> technique use it many times, not just once per package.  (A sample I 
>> looked at averaged 15 calls per package, but I don't know if that's 
>> unbiased.)
>>
>> But in any case, the skip_on_cran() function implements a version of 
>> Henrik's approach.  The name of the function is misleading, it doesn't 
>> attempt to distinguish between CRAN and a regular user.
> 
> I would guess because it can’t. If there were a standardised way of 
> identifying that the test is run on CRAN, I would use this immediately.

Then your package would fail when I ran the tests, because I don't have 
an API key, and I am not CRAN.  It makes more sense to me to treat CRAN 
the same as any other user who is not the author.

Duncan Murdoch

> 
> Cheers,
> 
> Rainer
> 
> 
>>
>> Duncan Murdoch
>>
>> ______________________________________________
>> R-package-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> 
> --
> Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation 
> Biology, UCT), Dipl. Phys. (Germany)
> 
> University of Zürich
> 
> Cell:       +41 (0)78 630 66 57
> email:      Rainer using krugs.de
> Skype:      RMkrug
> 
> PGP: 0x0F52F982
> 
> 
>