[BioC] Installing Bioconductor on Linux...

Mon Oct 4 14:59:02 CEST 2010

On Sun, Oct 3, 2010 at 10:29 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> Hi James --
>
> On 10/03/2010 06:36 PM, James Carman wrote:
>> I've got Bioconductor successfully installed on my linux workstation
>> at work, but we need to install it on our server.  I remember it being
>> particularly difficult to make sure I had all the right packages
>> installed in the O/S.  Is there an easier way?
>
> Others will have different suggestions, but my two cents...
>
> (1) Probably the way to proceed is to install R and biocLite() packages
> as 'root' or a similarly privileged account. Base and commonly used
> packages will be stored in a system-wide location. Individual users
> requiring additional packages will say biocLite("OtherPackage") and
> these will be installed in their own user directory as described in
> ?install.packages (which is what biocLite uses to install packages):
>
>     If 'lib' is omitted or is of length
>     one and is not a (group) writable directory, the code offers to
>     create a personal library tree (the first element of
>     'Sys.getenv("R_LIBS_USER")') and install there.
>
> (2) 'Third party' dependencies need to be satisfied in a rational way,
> remembering (a) that R packages have complex dependencies with other R
> packages, and (b) R compiles C (and other) source code and so requires
> header files associated with third party libraries. Combining these, one
> can imagine biocLite("biomaRt") failing because XML fails because RCurl
> fails because the *devel* libcurl (devel required for the curl headers)
> is not installed. This will be indicated in the output of biocLite, but
> finding it will require patient inspection of that output. Part of the
> third party installation process may mean evaluating standard Linux
> commands (e.g., /sbin/ldconfig), setting environment variables
> (LD_LIBRARY_PATH ?), and under worst-case scenarios (Rmpi comes to mind)
> inspecting the R package configure.in / configure.ac script (by
> downloading the source package from CRAN or Bioconductor) to understand
> what the requirements are and how they are supposed to be satisfied.
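
As a rough illustration of the point about development headers (not
something posted in the thread): a minimal shell sketch that checks,
before biocLite("biomaRt") is attempted, whether curl-config and
xml2-config are on the PATH. These helpers normally ship with the libcurl
and libxml2 development packages, so their absence is used here as a
proxy for missing headers. That proxy and the function name missing_tools
are assumptions made for the example, and the package names that provide
the helpers vary by distribution.

```shell
#!/bin/sh
# Print the name of every listed program that is not found on PATH,
# one per line. Returns non-zero if at least one is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}

# Assumption: curl-config and xml2-config stand in for the libcurl and
# libxml2 development files that RCurl and XML compile against.
missing=$(missing_tools curl-config xml2-config) ||
    printf 'Missing before running biocLite: %s\n' "$missing" >&2
```

A check like this only shortens the search; the biocLite output remains
the authoritative record of what actually failed.
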
>
> A final comment is that the next version of R is about to be released
> (scheduled October 15 for R, Oct 18 for Bioconductor), so if you're only
> going to get one opportunity to sit down with your system administrator
> you might want to delay for a couple of weeks. On the other hand it's a
> learning experience and much easier the second time.
>
> The Bioconductor team is interested in developing, over the next year or
> so, a more fool-proof way of distributing Bioconductor, so I encourage
> others to contribute their solutions and experiences.
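
On point (1) in the quoted message, the rule from ?install.packages can
be pictured with a small shell sketch: use the system-wide library when
it is an existing, writable directory, and otherwise fall back to a
per-user library. This only mirrors the quoted rule for illustration; it
is not how R implements it (R offers to create the personal library
interactively), and the function name choose_lib_dir and the example
paths are hypothetical.

```shell
#!/bin/sh
# Echo the directory packages would be installed into: the site library
# if it is an existing writable directory, otherwise the per-user
# library, which is created on demand.
choose_lib_dir() {
    site_lib=$1
    user_lib=$2
    if [ -d "$site_lib" ] && [ -w "$site_lib" ]; then
        printf '%s\n' "$site_lib"
    else
        mkdir -p "$user_lib" && printf '%s\n' "$user_lib"
    fi
}

# Example call (hypothetical paths, left commented out):
# choose_lib_dir /usr/lib/R/site-library "$HOME/R/library"
```

Run as root, the first branch is taken and packages land in the shared
location; run as an ordinary user, the second branch gives each person
their own tree.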

I have a set of scripts for doing daily updates/compilation of R-devel
that are run in a multi-user, multi-node setting here at Hopkins and
seem to work great now that I have ironed out the various edge cases
over the last few months.  I probably spend less than 5 minutes per week
on this (unlike in the beginning, when I spent quite a bit more time).
Of course, we have specific needs and use cases that may or may not be
relevant for others.

I have it on my todo list to make a document/webpage detailing what I
have done.  This will most likely be most useful for people who want
to have daily updates to R-devel with minimal maintenance.  It does
not address the problem of pulling in external dependencies.  There is
no big secret; I just have a fair amount of useful shell code.

But I certainly won't get around to annotating it in the next couple of weeks.

Kasper