[Bioc-devel] interesting edge case with Rhtslib linkage

Aaron Lun alun at wehi.edu.au
Tue Apr 18 16:48:35 CEST 2017

Hi all,

I encountered an interesting edge case involving Rhtslib linkage today, 
when I tried to use csaw on my institute's cluster. Package installation 
proceeded without a hitch, but running "library(csaw)" failed to load 
"csaw.so" due to a failure in finding "libhts.so.0".

The problem arises because the home drive is mounted at different 
locations on the headnode (where I install software) and on the 
cluster nodes (where the submitted jobs actually run):

~ on headnode: /cstHome/home/jmlab/
~ on cluster nodes: /home/jmlab/

This normally doesn't cause any issues because the headnode provides a 
softlink "/home", which points to "/cstHome/home". For end users or 
programs, this makes it seem as if the locations are the same. Indeed, 
my R installation thinks R_HOME is "/home/jmlab/software/R/devel", which 
is a valid path on both the headnode and cluster nodes. I usually have 
no problems running the same R code on either node.

However, Rhtslib::pkgconfig() calls system.file(), which calls 
.libPaths(), which performs path normalization to obtain the full file 
path without softlinks. This means that, upon installation on the 
headnode, "csaw.so" is linked to 
"/cstHome/home/jmlab/...etc.../library/Rhtslib/lib/libhts.so". This is 
fine when running jobs on the headnode, but fails with "file/directory 
not found" for "libhts.so.0" on the cluster node because the 
"/cstHome/..." path does not exist there.
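The softlink resolution described above can be reproduced in isolation. This is only a minimal sketch, assuming a Unix-like system where file.symlink() works; the directory names "cstHome_demo" and "home_demo" are hypothetical stand-ins for the real mount points and are created under tempdir():

```r
# Hypothetical stand-ins for "/cstHome/home" (real) and "/home" (softlink).
base <- normalizePath(tempdir())
real <- file.path(base, "cstHome_demo")
link <- file.path(base, "home_demo")

dir.create(real, showWarnings = FALSE)
if (!file.exists(link)) file.symlink(real, link)

# normalizePath() resolves the softlink and returns the "true" location,
# which is the kind of path that ends up baked into the linker flags.
resolved <- normalizePath(link)

print(link)      # the path as the user sees it
print(resolved)  # the same directory with the softlink resolved
```

A path obtained this way is only valid on machines where the resolved location exists, which is not the case for the cluster nodes here.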

So, the crux of the problem is that system.file() does not respect the 
soft links that are masking the "true" mount locations of the drives. In 
contrast, running R.home() gives me the expected 
"/home/jmlab/software/R/devel", with the softlink intact. Changing zzz.R 
to use:

system.file(..., lib.loc=R.home("library"))

... in the definition of Rhtslib::pkgconfig() fixes the problem.
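To illustrate the effect of the lib.loc argument without needing Rhtslib installed, here is a rough sketch using the base package "stats" as a stand-in. The helper name find_in_pkg() is hypothetical and is not part of Rhtslib; the sketch also assumes that the package of interest lives in the library under R.home(), as it does in my setup:

```r
# Hypothetical helper: locate a file inside an installed package, searching
# only the library under R.home() so that the R.home() prefix is used as-is.
find_in_pkg <- function(file, package) {
    system.file(file, package = package,
                lib.loc = R.home("library"), mustWork = TRUE)
}

# Default lookup, which goes through .libPaths() or the loaded namespace.
default_path <- system.file("DESCRIPTION", package = "stats")

# Lookup restricted to R.home("library").
home_path <- find_in_pkg("DESCRIPTION", "stats")

print(default_path)
print(home_path)
```

On a system without softlinks in the way, both calls should print the same location; on a setup like mine, only the second would be expected to keep the "/home/..." prefix. Note that this approach would not find packages installed in a user library outside R.home().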

Is there a better solution than what I've done? I don't know whether 
preservation of softlinks in full file paths is a desirable thing to do 
in general, though I would have thought that if it's good enough for 
R.home(), it's probably also good enough for system.file().



P.S. I should point out that the other obvious solution is to install 
csaw from the cluster nodes, such that the path that gets used by 
Rhtslib::pkgconfig() is "/home/jmlab/...". However, at least on this 
system, the cluster nodes don't have write access to "/home/jmlab/...". 
Storing the R installation on the Lustre file system (which is 
writable from the cluster nodes) would result in intolerably slow 
loading times.
