[R-sig-hpc] Encapsulated R for cluster use
edd at debian.org
Sun Nov 6 02:52:04 CET 2011
On 5 November 2011 at 20:48, Daryl Waggott wrote:
| Very interesting! Thanks Saptarshi. I'll look into this and bring it up at our re-invigorated Hadoop meetup.
What Saptarshi suggests is functionally equivalent to how 'third-party
software' (think: RStudio, Google Chrome, ...) wraps all required libraries
inside their own 'sumo' .deb packages. That is an alternative for you too.
And what Simon suggested (to stick the libraries you need into $R_HOME/lib)
is also just about the same, modulo the wrapping up in a .deb.
You are in essence just reinventing dependency management to create
self-sufficient deployment units.
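
For anyone who does go the bundling route, the first step is working out which
shared libraries the loader actually resolves. Below is a minimal Python sketch
of that inventory step, under these assumptions: the text comes from running
ldd on the R binary or a package's .so, and it has the usual glibc
"name => path (address)" layout. The function names and the sample output are
made up for illustration and are not part of R, RHIPE or any Debian tool.

```python
import re

# Assumed glibc ldd layout: "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x00007f...)"
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def resolved_libraries(ldd_output):
    """Return {soname: path} for every library that ldd resolved to a file."""
    libs = {}
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs


def unresolved_libraries(ldd_output):
    """Return the sonames that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


# Hypothetical ldd output, invented for this example.
SAMPLE = """\
\tlinux-vdso.so.1 =>  (0x00007fff5a1ff000)
\tlibR.so => /usr/lib/R/lib/libR.so (0x00007f1c2a000000)
\tlibblas.so.3gf => not found
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c29c00000)
\t/lib64/ld-linux-x86-64.so.2 (0x00007f1c2a600000)
"""

if __name__ == "__main__":
    for soname, path in sorted(resolved_libraries(SAMPLE).items()):
        print(soname, "->", path)
    print("missing:", unresolved_libraries(SAMPLE))
```

The resolved paths are what you would copy into the bundle directory (or
$R_HOME/lib); anything reported as missing is a dependency the node cannot
satisfy on its own. Virtual entries such as linux-vdso and the dynamic loader
itself are skipped on purpose, since they are not files to bundle this way.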
| From: Saptarshi Guha [saptarshi.guha at gmail.com]
| Sent: Saturday, November 05, 2011 7:53 PM
| To: Dirk Eddelbuettel
| Cc: Daryl Waggott; r-sig-hpc at r-project.org
| Subject: Re: [R-sig-hpc] Encapsulated R for cluster use
| When I use RHIPE I package the entire R distribution and send it to the nodes using Hadoop's DistributedCache. This facility takes a zip file, sends it to the nodes (only if there is a time difference) and unzips it. The script makes modifications to the R script and sets LD_LIBRARY_PATH.
| Please see http://code.google.com/p/rhipe/wiki/RHIPEonNodesWithoutR
| It's such a relief this way, no need to worry about node configuration. I hear Yahoo did something similar with Python.
| On Nov 4, 2011 9:18 PM, "Dirk Eddelbuettel" <edd at debian.org> wrote:
| On 4 November 2011 at 22:02, Daryl Waggott wrote:
| | Hello R HPC list,
| | In order to use any software tool in our production cluster environment we need to fully *bundle* or encapsulate all dependencies. This is due to a policy of reproducible and logged research which is not at the whims of a heterogeneous computing environment. My general question is --- are there any recommended strategies for compiling R with a set of packages that do not depend on (potentially changing) dynamically linked libraries? Static linking doesn't look practical in R, so I have been attempting to aggregate all the dependencies reported by the *ldd* command and put them in a directory that is referenced by setting the LD_LIBRARY_PATH environment variable. Needless to say, I haven't had much luck. The cluster is all 64-bit Debian. Are there any strategies at compile time that might be helpful? My knowledge of the ins and outs of all the configure and make parameters is a work in progress. Thank you for any advice or pointers to documentation.
| Why fight the package management system? Just record the versions of (or
| even store) all .deb packages needed to install R and the packages.
| A colleague of yours just asked about static builds on r-devel today... Some
| deadline looming, eh? ;-)
| "Outside of a dog, a book is a man's best friend. Inside of a dog, it is too
| dark to read." -- Groucho Marx
"Outside of a dog, a book is a man's best friend. Inside of a dog, it is too
dark to read." -- Groucho Marx