[Bioc-devel] Submitting a package with heavy data and vignette

Martin Morgan mtmorg@n@b|oc @end|ng |rom gm@||@com
Tue Nov 26 14:59:40 CET 2019


Rather than aiming for a large ExperimentData package, it might make more sense to create an ExperimentHub package, with the data hosted in the cloud for download on demand. The data are cached locally, so the download cost is paid only once. This is especially useful if your data consist of several sets and only one is needed for the vignette. In general it seems like a better strategy, since it makes it easier for mirrors (and our git server) to host the package.

http://bioconductor.org/packages/devel/bioc/vignettes/ExperimentHub/inst/doc/CreateAnExperimentHubPackage.html
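
As a rough illustration of the download-on-demand idea, a vignette could fetch a single data set along these lines. This is only a sketch under assumptions: "mySimData" is a hypothetical package name, and load_example_data() is a helper made up for this example, not part of ExperimentHub. The calls to ExperimentHub(), query(), names() and `[[` are the usual hub accessors.

```r
# Sketch: fetch one data set from ExperimentHub on demand.
# "mySimData" is a hypothetical package name used only for illustration.
load_example_data <- function(package = "mySimData", index = 1L) {
    if (!requireNamespace("ExperimentHub", quietly = TRUE))
        stop("Install ExperimentHub first: BiocManager::install('ExperimentHub')")

    # Connect to the hub.
    eh <- ExperimentHub::ExperimentHub()

    # Keep only the records whose metadata mention the package name.
    hits <- AnnotationHub::query(eh, package)
    if (length(hits) == 0L)
        stop("No ExperimentHub records found for '", package, "'")

    # Subsetting with `[[` downloads the resource the first time and
    # reads it from the local cache on later calls.
    hits[[names(hits)[index]]]
}

# Usage (requires network access and real hub records for the package):
# sim_data <- load_example_data("mySimData")
```

Only the set selected by `index` is downloaded, so a vignette that needs one of several data sets does not pay for the rest.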

I wanted to mention, though, that *many* authors have said 'my data is too big and I can't do a realistic vignette', only to come up in the long run with a real-enough example that exercises their package. This is tremendously valuable to the user, who can walk through the tough areas of package functionality illustrated in the vignette without having to invest excessive compute time.

Martin

On 11/26/19, 8:54 AM, "Bioc-devel on behalf of Turaga, Nitesh" <bioc-devel-bounces using r-project.org on behalf of Nitesh.Turaga using RoswellPark.org> wrote:

    Hi,
    
    I think this is a good path forward. Please take a look at the links below, which provide further guidelines:
     
    http://bioconductor.org/developers/package-guidelines/#data
    
    https://bioconductor.org/developers/package-submission/#experPackage
    
    https://github.com/Bioconductor/Contributions/blob/master/CONTRIBUTING.md#submitting-related-packages
    
    Best regards,
    
    Nitesh 
    
    On 11/26/19, 8:25 AM, "Bioc-devel on behalf of Joris Meys" <bioc-devel-bounces using r-project.org on behalf of Joris.Meys using UGent.be> wrote:
    
        Dear,
        
        
        We're planning to submit a new package to Bioconductor. Because this package revolves around simulation methods for massive datasets, the vignette necessarily needs about 10 Mb of data and far more than 5 minutes to build. We were wondering how best to proceed with submitting this package. Downsizing the data and build time is alas not possible, as it would make the example in the vignette totally irrelevant.
        
        
        I was thinking about the following construct:
        
        - a main software package with the actual simulation functionality
        
        - a "data" package depending on the main software package with only the example data and vignette.
        
        
        We would love to hear your view on this, as we'd like to limit the number of issues for both you and us once we submit the package(s). Other suggestions are more than welcome too.
        
        
        Thank you in advance
        
        Joris
        
        
        --
        Joris Meys
        Statistical consultant
        
        Department of Data Analysis and Mathematical Modelling
        Ghent University
        Coupure Links 653, B-9000 Gent (Belgium)
        ------------------------------
        
        Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
        
        
        
        _______________________________________________
        Bioc-devel using r-project.org mailing list
        https://stat.ethz.ch/mailman/listinfo/bioc-devel
        
    
    
    
    

