[BioC] Biobase/reposTools Dependency
Colin A. Smith
colin at colinsmith.org
Mon Jun 16 13:25:41 MEST 2003
I am looking into using Bioconductor as a slave batch processor for
other programs (BASE, other web tools, etc.). I've found that when
using Bioconductor for the statistical analysis, the majority of the
CPU time gets spent just loading it rather than on the actual
computation, which seems like a waste. This added overhead makes it
difficult to provide reasonably interactive output to the user. While
it might be possible to set up some sort of persistent R session that
gets reused, I'd rather reduce the complexity and just run R anew for
each analysis.
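For reference, the overhead is easy to measure. A rough sketch (the
exact numbers will of course vary by machine):

    ## time how long attaching Biobase and its dependencies takes,
    ## for comparison against the time spent on the analysis itself
    load.time <- system.time(library(Biobase))
    print(load.time)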
Breaking the dependency of Biobase on reposTools strikes me as a
particularly effective way to optimize the load sequence. Both
packages take a long time to load relative to loading R itself. (This
seems to stem mostly from their use of the methods package. Profiling
shows that 60% of CPU time gets spent in setMethod while loading
reposTools.)
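In case anyone wants to reproduce that, here is roughly what I did
with the standard R profiler (the output file name is arbitrary):

    ## profile what happens while reposTools is attached
    Rprof("reposTools-load.out")
    library(reposTools)
    Rprof(NULL)
    ## summarise time by function; setMethod dominates on my machine
    summaryRprof("reposTools-load.out")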
While it's nice to be able to automagically download and install R
libraries, the overwhelming majority of R sessions probably don't use
this feature. (Especially if the R library directory isn't owned by
the current user; the warning that pops up when that happens is
another annoyance...) Is there some other show-stopping reason for
Biobase to depend on reposTools?
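(For what it's worth, the dependency I mean is the one declared in
Biobase's DESCRIPTION file; something like the following shows it,
assuming Biobase is installed in the first library path and the
dependency sits in the Depends field:

    ## read the Depends field from the installed Biobase package
    read.dcf(file.path(.libPaths()[1], "Biobase", "DESCRIPTION"),
             fields = "Depends")
)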