[Bioc-devel] interesting edge case with Rhtslib linkage
Aaron Lun
alun at wehi.edu.au
Tue Apr 18 16:48:35 CEST 2017
Hi all,
I encountered an interesting edge case involving Rhtslib linkage today,
when I tried to use csaw on my institute's cluster. Package installation
proceeded without a hitch, but running "library(csaw)" failed: "csaw.so"
could not be loaded because "libhts.so.0" could not be found.
My problem stems from the fact that the home drive is mounted at
different locations on the headnode (where I install stuff) and on the
cluster nodes (where the submitted jobs actually get run):
~ on headnode: /cstHome/home/jmlab/
~ on cluster nodes: /home/jmlab/
This normally doesn't cause any issues because the headnode provides a
softlink "/home", which points to "/cstHome/home". For end users or
programs, this makes it seem as if the locations are the same. Indeed,
my R installation thinks R_HOME is "/home/jmlab/software/R/devel", which
is a valid path on both the headnode and cluster nodes. I usually have
no problems running the same R code on either node.
However, Rhtslib::pkgconfig() calls system.file(), which calls
.libPaths(), which performs path normalization to obtain the full file
path without softlinks. This means that, upon installation on the
headnode, "csaw.so" is linked to
"/cstHome/home/jmlab/...etc.../library/Rhtslib/lib/libhts.so". This is
fine when running jobs on the headnode, but fails with "file/directory
not found" for "libhts.so.0" on the cluster node because the
"/cstHome/..." path does not exist there.
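To illustrate the normalization step described above, here is a minimal
sketch in R. The directory names are hypothetical stand-ins created under
a temporary directory, and the sketch assumes a Unix-like system where
soft links can be created; it is not taken from Rhtslib.

```r
# Build a throwaway tree that mimics the headnode layout:
#   <root>/cstHome/home/jmlab   (the "true" location)
#   <root>/home -> <root>/cstHome/home   (the soft link)
root <- tempfile("mounts")
dir.create(file.path(root, "cstHome", "home", "jmlab"), recursive = TRUE)
root <- normalizePath(root)  # canonical form of the temporary root itself

real <- file.path(root, "cstHome", "home")
link <- file.path(root, "home")
ok <- file.symlink(real, link)  # assumption: soft links are permitted here

# The path as a user would type it, going through the soft link.
via_link <- file.path(link, "jmlab")

# normalizePath() resolves the soft link to its target.
resolved <- normalizePath(via_link)

print(via_link)
print(resolved)
```

The second path no longer contains the soft link, so it is only valid on
a machine where the "true" location exists.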
So, the crux of the problem is that system.file() does not respect the
soft links that are masking the "true" mount locations of the drives. In
contrast, running R.home() gives me the expected
"/home/jmlab/software/R/devel", with the softlink intact. Changing zzz.R
to use:
    system.file(..., lib.loc=R.home("library"))
... in the definition of Rhtslib::pkgconfig() fixes the problem.
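As a rough sketch of the difference (not the actual Rhtslib code), the
following compares the default lookup with one restricted to
R.home("library"). The "stats" package is used purely as a stand-in that
ships with R, and pkg_dir_for() is a hypothetical helper name.

```r
# Hypothetical helper: locate a package directory by searching only the
# library under R.home(), passed explicitly via lib.loc.
pkg_dir_for <- function(pkg) {
  system.file(package = pkg, lib.loc = R.home("library"))
}

default_dir <- system.file(package = "stats")  # default lookup, no lib.loc
home_dir <- pkg_dir_for("stats")               # restricted to R.home("library")

print(default_dir)
print(home_dir)
```

Note that a lookup restricted in this way only finds packages installed
in the library under R.home(); for a package that lives in a separate
user library, system.file() would return an empty string.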
Is there a better solution than what I've done? I don't know whether
preservation of softlinks in full file paths is a desirable thing to do
in general, though I would have thought that if it's good enough for
R.home(), it's probably also good enough for system.file().
Cheers,
Aaron
P.S. I should point out that the other obvious solution is to install
csaw from the cluster nodes, such that the path that gets used by
Rhtslib::pkgconfig() is "/home/jmlab/...". However, at least on this
system, the cluster nodes don't have write access to "/home/jmlab/...".
Storing the R installation on the Lustre file system (which is writable)
would result in intolerably slow loading times.