RFC: Loading packages at startup
Fri, 25 Oct 2002 09:15:07 +0200
>>>>> Prof Brian D Ripley writes:
> I've been kicking the following idea around for a while, and am now
> proposing to put some version into 1.7.0. I'd be interested in
> comments on the desirability and the design, before I start writing
> any code.
> S4 introduced a file .S.chapters which can contain a list of S
> chapters (equivalent to R packages) to be loaded on start-up. This
> was the germ of this proposal.
> Extend the initialization as described in Startup.Rd by having
> optionally files named, say, R_HOME/etc/Rpackages.site and .Rpackages,
> the latter in the starting directory or failing that the user's home
> directory. Each would contain a list of packages, one per line, to be
> loaded when R is started, in the order in the files.
> Some details:
> 1) I think the packages should be loaded before .Rprofile and .RData
> are processed, and R_HOME/etc/Rpackages.site before .Rpackages.
> This can be argued, and the S4 parallel would seem to be to load
> packages after S.init (the nearest it has to Rprofile). But we
> would load library/base/Rprofile first of all so the analogy is not
> 2) The present kludge of loading ctest in .First could be replaced
> by making ctest the default content of R_HOME/etc/Rpackages.site
> (in the light of point 5).
> 3) It would be useful to allow the library tree to be specified, as a
> second field on the line.
> 4) One problem with saving an R session and then restoring it is that
> the packages in use are not reloaded. Quitting an R session and
> saving could write .Rpackages in the current directory (with the
> library recorded if it were not the default). Then restarting a
> session in that directory would restore the loaded packages
> 5) We might want to allow a .Rpackages file to override Rpackages.site
> (or we might not). One idea is to allow a minus sign in front of a
> package name, and to merge the Rpackages.site and .Rpackages files
> before loading any packages. If we did this we probably need to be
> able to save the list of packages to be loaded (and can't easily save
> those not to be loaded), so perhaps -- as the first line of .Rpackages
> should empty the list.
> 6) One could argue for R_HOME/etc/Rpackages as the `system' file as
> well, and this might be useful if we break base up into smaller
> 7) I would allow comment lines in the files, starting with #.
> 8) The file name or names could be set by environment variables. It's
> strange that we allow the site file names for Rprofile and Renviron and
> the user file name for command histories to be set in that way.
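To make the quoted proposal concrete, here is a rough sketch of how the two
files might be read and merged. Everything in it is an assumption for
illustration: the function names parse_rpackages() and merge_rpackages() are
made up, the library tree is taken to be a whitespace-separated second field,
and "--" is read as "discard the site list", which is only one possible
reading of point 5.

```r
# Hypothetical helper: parse the lines of one Rpackages-style file.
# Assumed format: "pkg" or "pkg libtree" per line, "#" comment lines,
# "-pkg" to drop a package, "--" as the first line to empty the list.
parse_rpackages <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  reset <- length(lines) > 0 && lines[1] == "--"
  if (reset) lines <- lines[-1]
  fields <- strsplit(lines, "[[:space:]]+")
  name <- vapply(fields, `[`, character(1), 1)
  lib <- vapply(fields,
                function(f) if (length(f) > 1) f[2] else NA_character_,
                character(1))
  drop <- startsWith(name, "-")
  list(reset = reset,
       add = data.frame(package = name[!drop], lib = lib[!drop],
                        stringsAsFactors = FALSE),
       remove = sub("^-", "", name[drop]))
}

# Hypothetical helper: merge the site list with the user list before loading.
merge_rpackages <- function(site_lines, user_lines) {
  site <- parse_rpackages(site_lines)
  user <- parse_rpackages(user_lines)
  base <- if (user$reset) site$add[0, ] else site$add
  base <- base[!(base$package %in% user$remove), ]
  merged <- rbind(base, user$add)               # site entries come first
  merged <- merged[!duplicated(merged$package), ]
  rownames(merged) <- NULL
  merged
}

site <- c("# site defaults", "ctest")
user <- c("-ctest", "MASS", "mypkg /home/me/Rlib")
print(merge_rpackages(site, user))
```

With these inputs the merged list contains MASS and mypkg only, the latter
with its library tree attached. Note how much of the code exists purely to
handle the file format.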
I'd recommend against going this way.
In fact, I am not sure whether we really want to have *user environment*
files in the long run. We currently have two user files controlling
startup, .Renviron and .Rprofile. A split like this is necessary
because not all customization for R can be done from 'inside R', i.e.,
after R has been started; some has to be done before that. To my
understanding, this includes setting LD_LIBRARY_PATH (or the system's
equivalent), and maybe some env vars which can be given instead of
command line args, but with R_VSIZE and R_HSIZE sort of gone what else
is there? So perhaps the user environment mechanism is not the right
thing anyway (and it is not used by R CMD *).
For the things that can be done from inside R, I'd recommend doing them
there. E.g., there is really no reason to look for an env var
R_BROWSER when we can portably specify one using options(browser). The
fact that the packages to be loaded at startup are specified inside
.First(), and that users cannot simply add to .First(), is a problem that
needs to be addressed anyway. As Robert has indicated, the obvious idea
is to introduce an (Emacs-style) hooks mechanism, with hooks run at certain
times or, more generally, when certain events occur. Unfortunately, the
attempt to make this rather general, as discussed in Boston, means that
things also take a bit longer to get done. But conceptually, what we
want is a suite of
setHook() addHook() runHooks()
functions, and things like .First(), .First.lib() (and the user
variants), code to be run when creating a save image etc., can all be
integrated into this general mechanism.
[Basically, hooks are lists of functions because in some cases we need
to call them with certain arguments, e.g. the .First.lib package load
hook always has library and package and maybe also version eventually.]
For package config, users can then use setHook() to override the system
and/or site defaults, or addHook() to add to them. As everything
happens inside R there is no need for a special format, as discussed in
(3), (5), and (7).
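
As a rough illustration (a sketch under assumptions, not a design): the three
functions below are written from scratch and keep a list of functions per
event name. The event names "startup" and "load", the helper hooksFor(), and
fake_library() (a stand-in for library(), so the code runs without the
packages being installed) are all made up for this example.

```r
# Sketch only: written from scratch, not relying on any built-in of the same name.
.hooks <- new.env()

# Functions currently registered for an event (empty list if none).
hooksFor <- function(event) {
  if (exists(event, envir = .hooks, inherits = FALSE))
    get(event, envir = .hooks)
  else
    list()
}

# Replace whatever is registered for an event with a single function.
setHook <- function(event, fun) {
  assign(event, list(fun), envir = .hooks)
  invisible(NULL)
}

# Append a function to those already registered for an event.
addHook <- function(event, fun) {
  assign(event, c(hooksFor(event), list(fun)), envir = .hooks)
  invisible(NULL)
}

# Call each registered function in order, passing the event's arguments on.
runHooks <- function(event, ...) {
  invisible(lapply(hooksFor(event), function(f) f(...)))
}

# Stand-in for library(): records the request instead of loading anything.
loaded <- new.env()
loaded$pkgs <- character()
fake_library <- function(package, lib.loc = NULL) {
  loaded$pkgs <- c(loaded$pkgs, package)
  invisible(NULL)
}

# Site default, then a user addition; a library tree is just an argument.
setHook("startup", function() fake_library("ctest"))
addHook("startup", function() fake_library("MASS", lib.loc = "~/Rlib"))
runHooks("startup")
print(loaded$pkgs)

# A user who wants to override the defaults replaces the hook instead.
setHook("startup", function() fake_library("nlme"))

# Hooks that need arguments simply receive them from runHooks().
addHook("load", function(library, package)
  cat("loaded", package, "from", library, "\n"))
runHooks("load", library = "/usr/lib/R/library", package = "ctest")
```

After the first runHooks("startup") the recorded packages are ctest followed
by MASS. Once the user has called setHook(), a later run would request nlme
only. Comments, library trees and overrides all come for free from ordinary
R syntax.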
[If users cannot be expected to code their preferences in R ... then we
could provide an interactive tool which eventually emits the R code.]
Long term, startup configuration will include setting defaults, loading
packages, perhaps playing namespace magic with some of them, perhaps
everything according to predefined 'themes' ... and I'd like to have all
of this in one place, if possible.
Point (4) is very important, in particular if we think about saving and
restoring the 'state' of an R session (as opposed to just the work
space). I think many of us (I know of at least Greg, Fritz and
myself) have code going in this direction, but then we need more than
just names of the packages. We should perhaps also know about attached
non-package objects etc. Also, I am not sure about how attempting to
load/attach in reverse order will interact with namespace import/export
and pre-computing package dependencies before loading them. But in any
case, dumping an object that somehow represents the session state would
be extremely important---but I do think we need an R object to represent
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html