[Bioc-devel] R cmd check time limits for BioConductor

Tue Jun 10 19:20:36 CEST 2008

Hi Kevin,

Kevin R. Coombes wrote:
> Hi,
> 
> I have considered that possibility, but am not yet convinced that it is 
> the best approach. I will, of course, do something like that if I cannot 
> persuade this list that an alternative approach might be better. The 

   These are good points, and they are not new. We grapple with these 
issues regularly, and developers have adopted many different 
solutions.

   First, I think that there may be a misconception in play. The build 
system is essentially just that: a build system.  We don't have the 
resources to provide a comprehensive testing service for developers. We 
run minimal checks and push out the packages that pass them in as timely 
a fashion as possible. We do expect developers to take the steps you 
are describing and, before committing code to BioC, to run all the 
testing they deem necessary. But I don't see that there is, or should 
be, a reliance on having that testing done every day on four (or more) 
platforms, or on having it done by Bioconductor at all.  I think that is 
one of the developer's responsibilities, not the project's.

   The time limits were instituted because the build system was unable 
to complete within 24 hours. The math is pretty simple: there are over 
260 packages, so we need to build more than 10 per hour to finish 
every day.  As we upgrade equipment, we hope to be able to keep the 
guidelines as they are.  The other option is to allow longer tests at 
the cost of longer delays before packages are ready. My impression is 
that most developers would rather have their code available sooner, but 
I would appreciate hearing alternative points of view.  Perhaps this 
topic can be visited during the developer day at BioC2008.
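
   For concreteness, the arithmetic behind the limit can be sketched in 
shell. The package count, the 24-hour window and the five-minute 
guideline come from this thread; treating the build as strictly serial 
on a single machine is a simplifying assumption, and 260 is only a 
lower bound on the package count.

```shell
#!/bin/sh
# Back-of-the-envelope build budget, using only figures from this thread.
# Assumption: packages are checked one after another on a single machine.
packages=260          # "over 260 packages", so a lower bound
window_hours=24       # one full build cycle per day
limit_minutes=5       # per-package "R CMD check" guideline

# Average minutes available per package, kept in tenths of a minute
# because POSIX shell arithmetic is integer-only.
window_minutes=$((window_hours * 60))
per_package_tenths=$((window_minutes * 10 / packages))

# Total time if every package used its full five-minute allowance.
total_minutes=$((packages * limit_minutes))

echo "budget per package: $((per_package_tenths / 10)).$((per_package_tenths % 10)) min"
echo "total at the limit: $((total_minutes / 60)) h $((total_minutes % 60)) min"
```

   Under these assumptions the average budget is about five and a half 
minutes per package, and a full cycle at the five-minute limit already 
takes most of the day, which leaves little slack for longer tests.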

   I strongly encourage you to do the testing you believe is 
appropriate.  I will mention that the basic checks for all of R (and 
there are two levels of testing even there) would run within the time 
frame we have asked package maintainers to meet.

   best wishes
     Robert

> basic argument is:
> 
> * Complex algorithms can be better maintained if they are accompanied by 
> regression testing.
> * "R CMD check" provides an automated method to run regression tests, 
> with a defined directory structure for storing those tests.
> * Changing the directory location in the source makes running the 
> regression tests more awkward and thus less likely to occur on a regular 
> basis.
> * The "--no-tests" argument already provides a mechanism for preventing 
> the tests from being run.
> 
> What appears to be missing is either a mechanism to designate the tests 
> as optional or to indicate a preference for not running some or all of 
> them. I can think of three ways to accomplish my goals in this matter:
> 
> [1] Make "--no-tests" the default way to run "R CMD check" at 
> BioConductor. (Of course, this is unlikely to be the optimal solution 
> since it merely avoids the question.)
> [2] Add a field to the DESCRIPTION file that tells "R CMD check" whether 
> or not to run the tests. Something like
>     Tests: run
> or
>     Tests: dontrun
> [3] Add an optional special file in the tests directory that indicates 
> the complexity/length of the tests that would allow "R CMD check" to 
> decide whether or not to run them. Perhaps something like
> 
> ###################
> # COMPLEXITY file
> 
> test1.R: long
> test2.R: short
> ...
> ###################
> 
> Of course, options [2] or [3] require changes to "R CMD check" (for 
> which I should eventually move this discussion to the R-devel list), but 
> I am really only interested in convincing BioConductor that (possibly 
> complex) regression tests are a good thing, and should be encouraged by 
> adopting something like [1].
> 
> Best,
>     Kevin
> 
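
Option [3] in the message quoted above could be prototyped outside of 
"R CMD check" with a small shell helper. This is only a sketch: the 
COMPLEXITY file and its "name: long|short" format are Kevin's 
hypothetical proposal, the function name short_tests is made up for 
illustration, and "R CMD check" itself does not read such a file.

```shell
#!/bin/sh
# Hypothetical helper: list the tests marked "short" in a COMPLEXITY file.
# The file format (name: long|short, with '#' comment lines) follows the
# example in the quoted message; it is not something R CMD check supports.
short_tests() {
    # $1 = path to the COMPLEXITY file
    sed -e 's/#.*//' "$1" |
        awk -F': *' '$2 == "short" { print $1 }'
}

# Demo using the two entries from the quoted example.
tmp=$(mktemp)
printf '%s\n' '# COMPLEXITY file' 'test1.R: long' 'test2.R: short' > "$tmp"
selected=$(short_tests "$tmp")
rm -f "$tmp"
echo "$selected"
```

A developer could feed the resulting list to a local test runner and 
keep the long tests for occasional manual runs.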
> Laurent Gautier wrote:
>> 2008/6/10 Kevin R. Coombes <krcoombes at mdacc.tmc.edu>:
>>> Hi,
>>>
>>> The BioConductor package guidelines say that a package should take
>>> less than five minutes to run "R CMD check". I have a package that is
>>> almost ready to submit; however, it currently includes nontrivial
>>> regression testing in the "tests" subdirectory. With the tests, the
>>> time for "R CMD check" could be significantly longer than five
>>> minutes. Without the tests, the package easily fits within the time
>>> limit.
>>>
>>> [1] I know that I can run "R CMD check --no-tests [PKG]" to prevent the
>>> tests from running when I check the code myself. Is there any way for a
>>> package submitted to BioConductor to indicate that the tests should be
>>> skipped?
>>>
>>> [2] Alternatively, is there an easy way to include the tests so that
>>> I can run them whenever I want to make sure I haven't broken the code
>>> (too badly ...), but not force everyone else to run them when
>>> checking the rest of the structure of the code and documentation?
>>
>> You could consider having them in your package, in a directory
>> inst/tests/ for example
>> (so the tests are still available from an installed package).
>>
>>> Thanks in advance,
>>>    Kevin
>>>
>>> _______________________________________________
>>> Bioc-devel at stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org