[Bioc-devel] PPA with built bioconductor packages (for continuous integration)

Dan Tenenbaum dtenenba at fredhutch.org
Mon Nov 10 19:04:38 CET 2014



----- Original Message -----
> From: "Laurent Gautier" <lgautier at gmail.com>
> To: "Martin Morgan" <mtmorgan at fredhutch.org>
> Cc: bioc-devel at r-project.org, "Dan Tenenbaum" <dtenenba at fredhutch.org>
> Sent: Monday, November 10, 2014 9:57:00 AM
> Subject: Re: [Bioc-devel] PPA with built bioconductor packages (for continuous integration)
> 
> 
> 
> They would work in the context of a well-defined system such as the VM
> used by popular continuous integration providers (Travis or Drone
> for example).
> 
> Then it would be as easy as having the binaries built as artifacts by
> continuous integration and made available to other continuous
> integration processes.

This sounds to me like a pretty good use case for docker/rocker. We just need to define what packages should be installed on a given image; I don't think we want the images to be too big (unlike the AMI). The images could be rebuilt daily. So you'd still need to download the diffs from the previous image but I imagine this would take less time than building those packages from source.
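
As a rough sketch of what "define what packages should be installed on a given image" could look like, the helper below prints a Dockerfile for a given package list. The rocker/r-base base image, the BiocManager installer, the CRAN mirror URL and the function name make_dockerfile are all assumptions for illustration, not something settled in this thread.

```shell
#!/bin/sh
# Sketch only: print a Dockerfile that preinstalls a chosen set of
# Bioconductor packages. Assumed (not from the thread): the rocker/r-base
# base image, BiocManager as the installer, and the CRAN mirror URL.

make_dockerfile() {
    # Turn the arguments into an R character vector body: "a", "b", ...
    pkgs=""
    for p in "$@"; do
        pkgs="${pkgs:+$pkgs, }\"$p\""
    done
    cat <<EOF
FROM rocker/r-base
RUN Rscript -e 'install.packages("BiocManager", repos = "https://cloud.r-project.org")' \\
 && Rscript -e 'BiocManager::install(c($pkgs), ask = FALSE)'
EOF
}
```

Something like `make_dockerfile Biobase limma > Dockerfile && docker build -t bioc-ci .` (the tag bioc-ci is hypothetical) would then build the image; a daily rebuild would rerun the same two steps.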

Dan




> On Nov 10, 2014 6:19 PM, "Martin Morgan" < mtmorgan at fredhutch.org >
> wrote:
> 
> 
> On 11/09/2014 11:06 AM, Dan Tenenbaum wrote:
> 
> 
> 
> 
> ----- Original Message -----
> 
> 
> From: "Martin Morgan" < mtmorgan at fredhutch.org >
> To: "Laurent Gautier" < lgautier at gmail.com >, bioc-devel at r-project.org
> Sent: Sunday, November 9, 2014 8:26:48 AM
> Subject: Re: [Bioc-devel] PPA with built bioconductor packages (for continuous integration)
> 
> On 11/09/2014 07:23 AM, Laurent Gautier wrote:
> 
> 
> Hi,
> 
> Continuous integration is a convenient way to automate some of the steps
> necessary to ensure quality software.
> 
> Popular ways to do it create a vanilla virtual machine (VM) with a
> Linux distribution, and scripts prepare the VM with 3rd-party
> dependencies required by the software. For example, the popular CI
> system Travis for GitHub creates by default a VM running Ubuntu, and
> dependencies can be installed with `apt-get install`.
> 
> When developing software that requires CRAN/bioconductor, the latest R
> is available precompiled but the R packages must be downloaded and
> installed from source.
> 
> This can take a relatively long time. On a recent project over 80% of
> the time is spent downloading/installing the R/BioC packages. The
> remainder is building the code and running the unit tests.
> 
> Having a Personal Package Archive (PPA) with bioconductor packages
> already compiled would both speed up the process and make the use of
> continuous integration by projects relying on bioconductor packages
> easier.
> 
> Is this something others would like to have, and is this something that
> bioconductor would see as part of its mission to provide / help provide
> quality software, and be able to host?
> 
> It would be interesting to catalogue objectives (e.g., development vs.
> reproducibility) and available alternatives (e.g., PPA, docker / Rocker,
> AMI, existing or possible cloud services [such as the Bioc 'single
> package builder' used to build and check new package submissions, or
> travis itself], the Becker repository management scheme Michael and Gabe
> mention, ...);
> 
> 
> Just to add to the mix of options, it's possible to run
> R CMD INSTALL --build on a source tarball on Linux and it will create
> a 'binary' version that is already compiled.
> 
> These binaries are in general not portable, either within or between
> distributions, e.g., because the user has a different version of a
> system dependency than the one the binary was built against.
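
A minimal sketch of the build step described above, assuming R and Rscript are on the PATH. The function names and the example tarball name are hypothetical. The naming pattern <pkg>_<version>_R_<platform>.tar.gz is what R CMD INSTALL --build produces on Linux as far as I know, so treat it as an assumption to check against your R version.

```shell
#!/bin/sh
# Sketch only: build a Linux "binary" tarball from a source tarball.
# Function names and file names are hypothetical.

# Name of the built tarball for a given source tarball and platform
# string, assuming the pattern <pkg>_<version>_R_<platform>.tar.gz.
binary_name() {
    src=$(basename "$1" .tar.gz)
    printf '%s_R_%s.tar.gz\n' "$src" "$2"
}

# Build the binary in the current directory and print its file name.
build_binary() {
    src_tarball=$1
    # Platform string of the running R, e.g. x86_64-pc-linux-gnu
    platform=$(Rscript -e 'cat(R.version$platform)') || return 1
    # Send R's own output to stderr so stdout carries only the file name
    R CMD INSTALL --build "$src_tarball" >&2 || return 1
    binary_name "$src_tarball" "$platform"
}
```

On another machine with a matching setup, `built=$(build_binary mypkg_1.0.tar.gz)` followed by `R CMD INSTALL "$built"` would install without recompiling. The portability caveat above still applies: the tarball is only safe to reuse where the system libraries match the build host.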
> 
> Martin
> 
> 
> 
> The problem with this is (AFAIK) there is no corresponding package
> type that can be used with install.packages();
> otherwise the simplest solution would be to add a CRAN-style repos
> containing these "binaries". Maybe R could be patched to allow this?
> But it's possible that the requirements for Linux "binaries" could
> vary depending on many things: cpu type (intel or solaris, or...),
> architecture (i386, x64), presence/absence of BLAS/LAPACK, etc etc
> etc. This suggests that a vm or container-based approach might be
> better.
> 
> Dan
> 
> 
> 
> 
> 
> 
> if there is a clear path forward satisfying some plurality of users
> without too many technical obstacles then it might fall within the Bioc
> purview; my initial sense is that there is not a consensus on use cases
> or viable implementations, but I can be convinced otherwise...
> 
> In terms of Tim's post, getting your colleague to use a PPA / existing
> alternative (e.g., the Bioc AMI,
> http://bioconductor.org/help/bioconductor-cloud-ami/ which comes with
> RStudio Server installed...) is not likely to be easier / faster than
> getting them to download / install relevant R / Bioc packages. One
> interesting possibility is a 'hosted' bioconductor with sufficient
> computational resources on the back-end and RStudio Server on the
> front end; this is not impossible to imagine seeking funding for.
> 
> 
> 
> 
> 
> 
> Martin
> 
> 
> 
> 
> Best,
> 
> Laurent
> 
> 
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> 
> 
> 
> --
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
> 
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793
> 
>


