[Bioc-devel] interesting edge case with Rhtslib linkage

Martin Morgan martin.morgan at roswellpark.org
Tue Apr 18 19:29:20 CEST 2017


On 04/18/2017 10:48 AM, Aaron Lun wrote:
> Hi all,
>
> I encountered an interesting edge case involving Rhtslib linkage today,
> when I tried to use csaw on my institute's cluster. Package installation
> proceeded without a hitch, but running "library(csaw)" failed to load
> "csaw.so" due to a failure in finding "libhts.so.0".
>
> My problem stems from the fact that the home drive is mounted in
> separate locations between the headnode (where I install stuff) and the
> cluster nodes (where the submitted jobs actually get run):
>
> ~ on headnode: /cstHome/home/jmlab/
> ~ on cluster nodes: /home/jmlab/
>
> This normally doesn't cause any issues because the headnode provides a
> softlink "/home", which points to "/cstHome/home". For end users or
> programs, this makes it seem as if the locations are the same. Indeed,
> my R installation thinks R_HOME is "/home/jmlab/software/R/devel", which
> is a valid path on both the headnode and cluster nodes. I usually have
> no problems running the same R code on either node.
>
> However, Rhtslib::pkgconfig() calls system.file(), which calls
> .libPaths(), which performs path normalization to obtain the full file
> path without softlinks. This means that, upon installation on the
> headnode, "csaw.so" is linked to
> "/cstHome/home/jmlab/...etc.../library/Rhtslib/lib/libhts.so". This is
> fine when running jobs on the headnode, but fails with "file/directory
> not found" for "libhts.so.0" on the cluster node because the
> "/cstHome/..." path does not exist there.
>
> So, the crux of the problem is that system.file() does not respect the
> soft links that are masking the "true" mount locations of the drives. In
> contrast, running R.home() gives me the expected
> "/home/jmlab/software/R/devel", with the softlink intact. Changing zzz.R
> to use:
>
> system.file(..., lib.loc=R.home("library"))
>
> ... in the definition of Rhtslib::pkgconfig() fixes the problem.
>
> Is there a better solution than what I've done? I don't know whether
> preservation of softlinks in full file paths is a desirable thing to do
> in general, though I would have thought that if it's good enough for
> R.home(), it's probably also good enough for system.file().

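The symlink expansion described above can be reproduced without Rhtslib. This is only a minimal sketch, assuming a POSIX system where file.symlink() works; the temporary directory names are made up for illustration and stand in for "/cstHome/home" and "/home".

```r
# Stand-ins for the real mount point and the softlink (hypothetical names).
real <- tempfile("cstHome")
link <- tempfile("home")
dir.create(real)
file.symlink(real, link)

# normalizePath() resolves the softlink, so both spellings collapse to the
# "true" location. .libPaths() appears to normalize its entries the same way.
resolved_link <- normalizePath(link)
resolved_real <- normalizePath(real)
print(c(link = link, resolved = resolved_link))
```

Any path derived from the normalized library location (as system.file() does) therefore carries the real mount point rather than the softlink.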
Two different ideas are (a) to provide an environment variable
RHTSLIB_RPATH that can override system.file() (I think it's actually
.libPaths() that is using normalizePath() and expanding symlinks)

pkgconfig <-
     function(opt = c("PKG_LIBS", "PKG_CPPFLAGS"))
{
     path <- Sys.getenv(
         "RHTSLIB_RPATH",
         system.file("lib", package="Rhtslib", mustWork=TRUE)
     )
     if (nzchar(.Platform$r_arch)) {
...

and (b) to use static rather than dynamic linking, as we do on macOS

...
     result <- switch(match.arg(opt), PKG_CPPFLAGS={
         sprintf('-I"%s"', system.file("include", package="Rhtslib"))
     }, PKG_LIBS={
         switch(Sys.info()['sysname'], Linux={
             sprintf('%s/libhts.a -lz -pthread', patharch)
         }, Darwin={
...
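
For idea (a), the override behaviour can be sketched in isolation. The helper name lib_dir_sketch() and both paths below are hypothetical placeholders, not part of Rhtslib; the only real API used is Sys.getenv() with its 'unset' argument.

```r
# Hypothetical helper: choose the directory that ends up in the link flags.
lib_dir_sketch <- function(default) {
    # Sys.getenv() returns 'unset' when the variable is not defined.
    Sys.getenv("RHTSLIB_RPATH", unset = default)
}

# Placeholder paths standing in for the normalized and softlinked locations.
normalized <- "/cstHome/home/jmlab/lib"
softlinked <- "/home/jmlab/lib"

Sys.unsetenv("RHTSLIB_RPATH")
without_override <- lib_dir_sketch(normalized)   # falls back to the default

Sys.setenv(RHTSLIB_RPATH = softlinked)
with_override <- lib_dir_sketch(normalized)      # the variable wins
```

The variable would need to be set when the dependent package (csaw here) is installed, since that is when pkgconfig() is consulted.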

On balance I think it would be as easy to use static linking, but I'm 
open to other ideas.
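
To make the trade-off concrete, here is a rough sketch of the two kinds of PKG_LIBS strings. The helper pkg_libs_sketch() is hypothetical, and the dynamic flags are my assumption of what an rpath-based link line typically looks like, not a quote of the current code.

```r
# Hypothetical helper: build a PKG_LIBS string for a given library directory.
pkg_libs_sketch <- function(lib_dir, static = TRUE) {
    if (static) {
        # Static: name the archive directly, so nothing has to be located
        # when the dependent package is loaded.
        sprintf('%s/libhts.a -lz -pthread', lib_dir)
    } else {
        # Dynamic (assumed form): -Wl,-rpath records lib_dir inside the
        # dependent shared object, so lib_dir must exist at load time.
        sprintf('-L%s -Wl,-rpath,%s -lhts -lz -pthread', lib_dir, lib_dir)
    }
}

static_flags  <- pkg_libs_sketch("/some/lib/dir")
dynamic_flags <- pkg_libs_sketch("/some/lib/dir", static = FALSE)
```

With the static form no library path is recorded in csaw.so, so a mismatch between headnode and cluster-node mount points no longer matters at load time.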

Martin

>
> Cheers,
>
> Aaron
>
> P.S. I should point out that the other obvious solution is to install
> csaw from the cluster nodes, such that the path that gets used by
> Rhtslib::pkgconfig() is "/home/jmlab/...". However, at least on this
> system, the cluster nodes don't have write access to "/home/jmlab/...".
> Storing the R installation on the lustre file system (which can be
> written) would result in intolerably slow loading times.
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

