[Bioc-devel] Methods to speed up R CMD Check

Martin Morgan mtmorg@n@b|oc @end|ng |rom gm@||@com
Mon Mar 22 14:34:40 CET 2021


If your examples repeatedly calculate the same thing, and this is also typical of how users use your package, it might make sense to 'memoise' key functions in your package; see https://cran.r-project.org/package=memoise
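
As a rough illustration of the idea, here is a minimal sketch using memoise::memoise(). The function slow_summary() and the counter n_calls are hypothetical stand-ins for an expensive function in your package; they are not taken from EWCE.

```r
library(memoise)

# Hypothetical stand-in for an expensive package function.
n_calls <- 0
slow_summary <- function(n) {
  n_calls <<- n_calls + 1   # count how often the body really runs
  Sys.sleep(0.1)            # pretend this step is slow
  sum(seq_len(n))
}

# Wrap the function; results are cached per distinct set of arguments.
fast_summary <- memoise(slow_summary)

fast_summary(100)  # computed, body runs
fast_summary(100)  # same arguments, served from the cache
```

The cache created this way lives in memory for the current R session, so it helps when the same call is repeated within one run of the examples, tests or vignette, but it does not carry over between separate R processes.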

Martin

On 3/22/21, 7:41 AM, "Bioc-devel on behalf of Kern, Lori" <bioc-devel-bounces using r-project.org on behalf of Lori.Shepherd using RoswellPark.org> wrote:

    If your data comes from ExperimentHub, it should already be cached. After the first download, subsequent calls to the hub should use the cached copy. We will investigate to ensure that the caching mechanism is functioning properly on all of our Bioconductor builders.



    Lori Shepherd
    Bioconductor Core Team
    Roswell Park Comprehensive Cancer Center
    Department of Biostatistics & Bioinformatics
    Elm & Carlton Streets
    Buffalo, New York 14263

    ________________________________
    From: Bioc-devel <bioc-devel-bounces using r-project.org> on behalf of Murphy, Alan E <a.murphy using imperial.ac.uk>
    Sent: Monday, March 22, 2021 5:38 AM
    To: bioc-devel using r-project.org <bioc-devel using r-project.org>
    Subject: [Bioc-devel] Methods to speed up R CMD Check

    Hi all,

    I am working on the development of [EWCE](https://github.com/NathanSkene/EWCE) but have hit an issue with the runtime of R CMD check. I have been informed that the check needs to complete within 15 minutes, but mine currently takes ~24 minutes, and I am looking for ways to speed it up. The main culprits for the runtime are:

    checking examples (5m 49.8s)
    Running 'testthat.R' [308s/469s] (7m 49.1s)
    checking for unstated dependencies in vignettes (7m 49.4s)
    checking re-building of vignette outputs (5m 12s)

    Apart from using smaller datasets, which I will consider myself, are there known ways of speeding these up? EWCE derives data from an ExperimentHub package, [ewceData](https://github.com/neurogenomics/ewceData), for its examples, tests and vignette. The datasets are loaded repeatedly, and I have noticed that loading a dataset takes a significant amount of time. Is there any way of caching the datasets for all the checks, or more generally of speeding this up?

    I have heard of the use of [long tests](http://bioconductor.org/developers/how-to/long-tests/), which aren't run daily by Bioconductor, but are these still checked in R CMD check? Is there any other way to exclude my tests from R CMD check, given that they aren't required by Bioconductor?

    Does checking for unstated dependencies in vignettes have a long runtime based on the number of package dependencies? If I just export specific functions from packages will this check time reduce?

    Lastly, is there any way to get an exception to the 15-minute maximum? I may be ill-informed, but is the maximum time for packages on Bioconductor's daily check 40 minutes? My code in its current state would complete within that limit.

    Kind regards,
    Alan.






    _______________________________________________
    Bioc-devel using r-project.org mailing list
    https://stat.ethz.ch/mailman/listinfo/bioc-devel

