[Rd] help files for load and related functions

Martin Maechler maechler at stat.math.ethz.ch
Tue Dec 18 10:42:25 CET 2007

>>>>> "DM" == Duncan Murdoch <murdoch at stats.uwo.ca>
>>>>>     on Mon, 17 Dec 2007 09:36:48 -0500 writes:

    DM> On 12/17/2007 9:06 AM, Oleg Sklyar wrote:
    >> Dear Patrick,
    >> Firstly, and most importantly, I do not think that your post qualified
    >> for Rd! Please use the correct mail list for such things: R-help. I do
    >> not think anybody on Rd wants mailboxes clogged with irrelevant
    >> messages.

 { Oleg, you may have to be told that Pat Burns has been
   acquainted with the S language for a very long time, maybe
   about as long as you have known how to read... } 

    DM> Since Patrick's message was about changes to the documentation, I think 
    DM> it is relevant to this list.

yes indeed!
And the technicality of the discussion further down
is another good reason.

    DM> Duncan Murdoch

    >> Back to your question: it is not clear if you are confused, or your
    >> 'user' is confused, but all three help pages look pretty clear and
    >> straightforward to me. Moreover, I do not see any connection between
    >> attach and library, which you find logical:
    >> - load - the general use of this one is to load external data sets, e.g.
    >> load serialised R object(s) (as the example shows). Until you load, you
    >> cannot use the object as it has no relation to the R session and can be
    >> e.g. a file sitting somewhere on a network
    >> - attach - the general use of this one would be to access elements of a
    >> data set directly, without the data set name specifier and the accessor
    >> operator, such as $, thus as the help page states - it is used to add
    >> the data set to the search path (as the example shows). If you look at
    >> the example, you do not have to call attach to be able to use data, data
    >> could have existed there before and what you effectively get with attach
    >> is a more convenient way of dealing with the data
    >> - library - is used to load *and* attach an R package, which is not
    >> exactly the same as a serialised R object(s), but a full set of other
    >> functionality. Attaching packages is just a part of the loading process,
    >> which occurs basically when the package becomes visible to the user.
    >> Same as with load, you cannot use the package until you load it. There
    >> is not a hint of similarity between loading a package and attaching a
    >> data set as I see it. 

Hmm, I think there is ... and there's more:

The function load() is well known for loading R objects into the
global environment; well known, easy to understand.  However, it
can load into any other environment; and environments are the
crucial entities here.
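
A minimal sketch of that point, with made-up object names (x, y) and a
temporary file: load() can restore saved objects into an environment of
your choice and leave the global environment alone.  For contrast, the last
lines use attach() on the same file, which puts the objects into a new
database on the search path instead.  The name "saved_objs" is purely
illustrative.

```r
# Save two objects to a temporary file, then remove them again.
x <- 1:3
y <- "hello"
f <- tempfile(fileext = ".RData")
save(x, y, file = f)
rm(x, y)

# load() into a fresh environment rather than the global one.
e <- new.env()
loaded <- load(f, envir = e)   # load() returns the names of the restored objects

in_e      <- exists("x", envir = e, inherits = FALSE)
in_global <- exists("x", envir = globalenv(), inherits = FALSE)

# attach() on the same file creates a new database on the search path.
attach(f, name = "saved_objs")
on_search <- "saved_objs" %in% search()

# Clean up: drop the attached database and the temporary file.
detach("saved_objs")
unlink(f)
```

Here in_e should be TRUE and in_global FALSE, i.e. the objects live only in
e until attach() makes them reachable via the search path.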
BUT  when talking about loading in the context of R packages *and*
namespaces (!), there are other things:

One important point that I think has not been mentioned yet, and is probably *the*
reason for potential confusion among useRs and even programmeRs:

  library(<package>)  does conceptually two things

  1) it *loads* the (exported) objects from the installed package
      (or with lazy-loading just loads "stubs") into a new environment.
  2) it "attaches" the names of those objects to the search() path

where things happen a bit differently for namespaced and other
packages.  For namespaced packages the two steps are really nicely
separable on a user level:  I hope you know
loadNamespace(), unloadNamespace(), attachNamespace() and the
fact that e.g. cluster::pam() loads the cluster package's namespace
but does not attach cluster to search().
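
The two steps can be tried out separately in a fresh R session.  The sketch
below uses the 'splines' package only because it ships with R and is
normally not attached at startup; that choice is an assumption, and any
installed namespaced package that is not yet attached should behave the same.

```r
pkg <- "splines"                      # assumed: installed, not yet attached
entry <- paste0("package:", pkg)      # how the package appears in search()

# Step 1 only: using '::' loads the namespace but does not attach it.
f <- splines::bs
loaded_only     <- isNamespaceLoaded(pkg)
attached_before <- entry %in% search()

# Step 2: attach the already loaded namespace to the search path.
attachNamespace(pkg)
attached_after <- entry %in% search()

# Undo both steps again: first detach, then unload the namespace.
detach(entry, character.only = TRUE)
unloadNamespace(pkg)
```

After step 1 the namespace is loaded while search() is unchanged; only
attachNamespace() adds the "package:splines" entry.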

 { If you want to delve deeper and look at the library()
   function, please do so in the sources, e.g.,
   which has many comments that are all gone in the 'library' function object. }

I'd say: Because loading is the more delicate of the two steps,
help(library) talks more about loading the package
than about attaching it.

Regards, Martin

    >> Regards,
    >> Oleg
    >> On Mon, 2007-12-17 at 11:00 +0000, Patrick Burns wrote:
    >>> I recently had a discussion with a user about loading
    >>> and attaching in R.  I was surprised that the help files
    >>> don't provide a very clear picture.
    >>> From my point of view 'load' and 'attach' are very
    >>> similar operations, the difference being that 'attach'
    >>> creates a new database on the search list while 'load'
    >>> puts all the objects into the global environment.
    >>> The help file for 'load' is not explicit that this is what
    >>> happens.  Neither the 'load' nor the 'attach' help file refers
    >>> to the other in its See Also.
    >>> Furthermore, the 'library' help file talks about "loading"
    >>> packages.  I would suggest that it should use "attaching"
    >>> as that is the analogous operation.
    >>> None of these three help files (nor that of 'save') has a
    >>> Side Effects section.  Personally I think that all help files
    >>> should have a Side Effects section (to make it clear to
    >>> new users what side effects are and that they are not a
    >>> good thing for most functions to have).  I can understand
    >>> there could be another point of view on that.  However, I
    >>> definitely think that there should be a Side Effects section
    >>> in the help files of functions whose whole point is a side
    >>> effect.
    >>> Patrick Burns
    >>> patrick at burns-stat.com
    >>> +44 (0)20 8525 0696
    >>> http://www.burns-stat.com
    >>> (home of S Poetry and "A Guide for the Unwilling S User")
