[Rd] Loose code in R package files

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Jun 16 10:15:43 CEST 2011


Peter Dalgaard's reply made me wonder if you were looking for a 
precise answer to

> what will actually happen at build and load time?

The best way to answer such questions is to read the source code (and 
now that this is all written in R, it should be accessible to all).

Nothing happens at 'build time', i.e. when R CMD build is run.

How things are split between install and load time depends on whether 
the package uses lazy loading.  Some of us think that maybe all 
packages should, not least to simplify code and explanations like 
this.

No lazy loading:

The R source files are concatenated at install time (in an order 
described in 'Writing R Extensions'), and the resulting file stored as 
<pkg>/R/<pkg>.  When the package (or its name space) is loaded, that 
file is effectively source()d into an environment.  (There are some 
small differences as sys.source() is used.)  So side effects may 
affect the current R session, and some options (such as 
'keep.source.pkgs') are acted on at load time, which is when the R 
objects are created.
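
A minimal sketch of the non-lazy case, assuming a temporary file stands in 
for the concatenated <pkg>/R/<pkg> (the file contents and the option name 
'demo.loose.code' are made up for illustration).  It only mimics the load 
step with sys.source(); the real loader does more than this.

```r
# Stand-in for the concatenated <pkg>/R/<pkg> file (illustrative contents).
pkg_file <- tempfile(fileext = ".R")
writeLines(c(
    "x <- rnorm(5)                     # loose code: creates an object",
    "options(demo.loose.code = TRUE)   # loose code: a side effect",
    "f <- function() mean(x)"
), pkg_file)

# Make sure the (hypothetical) option is unset before 'loading'.
options(demo.loose.code = NULL)

# Roughly what happens at load time: evaluate the file in an environment.
pkg_env <- new.env()
sys.source(pkg_file, envir = pkg_env)

ls(pkg_env)                    # "f" "x"
getOption("demo.loose.code")   # TRUE: the side effect reached this session

# A second 'load' re-runs the code, so 'x' is drawn afresh each time.
pkg_env2 <- new.env()
sys.source(pkg_file, envir = pkg_env2)
```

Comparing pkg_env$x with pkg_env2$x shows that each load creates its own 
objects, which is the point made above.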

Lazy loading:

The R source files are concatenated at install time (in an order
described in 'Writing R Extensions'), to a single file <pkg>/R/<pkg> 
[as before].

That file is then (at install time) effectively source()d into an 
environment and that environment is dumped to a lazy-load database, 
and <pkg>/R/<pkg> replaced by a stub loader.  So any side effects 
apply only to the R process which does the loading and dumping, and 
the objects are created at install time.

At load time, the stub loader creates promises to objects in the
database in an environment in the current session.
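
The install-time/load-time split can be imitated along these lines.  This 
is only a sketch: it uses saveRDS() and delayedAssign() in place of R's 
real lazy-load database (which has its own on-disk format), and the helper 
names make_promise() and stub_load() are hypothetical.

```r
# "Install time": evaluate the package code once and dump each object.
src <- tempfile(fileext = ".R")
writeLines(c("x <- rnorm(5)", "n <- length(x)"), src)
build_env <- new.env()
sys.source(src, envir = build_env)

db_dir <- tempfile("lazydb")
dir.create(db_dir)
for (nm in ls(build_env, all.names = TRUE)) {
    saveRDS(get(nm, envir = build_env),
            file.path(db_dir, paste0(nm, ".rds")))
}

# "Load time": a stub creates one promise per stored object.
make_promise <- function(nm, path, envir) {
    force(path)  # fix the path now; the object itself is read on first use
    delayedAssign(nm, readRDS(path),
                  eval.env = environment(), assign.env = envir)
}

stub_load <- function(db_dir, envir) {
    files <- list.files(db_dir, pattern = "\\.rds$", full.names = TRUE)
    for (path in files) {
        make_promise(sub("\\.rds$", "", basename(path)), path, envir)
    }
    invisible(envir)
}

session_env <- new.env()
stub_load(db_dir, session_env)

# The code is not re-run: the session sees the values made at install time.
identical(get("x", envir = session_env), get("x", envir = build_env))  # TRUE
```

Because rnorm(5) ran only in the process that built the database, every 
session that loads it gets the same 'x'.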

There is rather more to it, e.g. for lazydata and packages with 
sysdata.rda, but that is the gist.

On Wed, 15 Jun 2011, Prof Brian Ripley wrote:

> On Tue, 14 Jun 2011, Stephen Ellison wrote:
>
>> I'm sure I've seen the answer to this, but can't find it:
>> 
>> If there is executable code in an R package .R file that does not return a 
>> function (that is, something like x <- rnorm(5), outside any function body), 
>> what will actually happen at build and load time?
>> 
>> And (more importantly, so I can point a colleague to documented guidance on 
>> the matter) is there somewhere in R's docs that says why this is not likely 
>> to be a good idea and suggests the sort of things it is sensible - or not 
>> sensible - to include in .R files?
>
> It is documented in 'Writing R Extensions':
>
> The R subdirectory contains R code files, only.
> ...
> It should be possible to read in the files using source(), so R objects must 
> be created by assignments. Note that there need be no connection between the 
> name of the file and the R objects created by it. Ideally, the R code files 
> should only directly assign R objects and definitely should not call 
> functions with side effects such as require and options. If computations are 
> required to create objects these can use code 'earlier' in the package (see 
> the 'Collate' field) plus, only if lazyloading is used, functions in the 
> 'Depends' packages provided that the objects created do not depend on those 
> packages except via name space imports. (Packages without name spaces will 
> work under somewhat less restrictive assumptions.)
>
> So your example creates an object 'x' in the package or name space, which is 
> perfectly legal but may not be intended.  For example, R's base package 
> does
>
> ## needs to run after paste()
> .leap.seconds <- local({
>    .leap.seconds <-
>        c("1972-6-30", "1972-12-31", "1973-12-31", "1974-12-31",
>          "1975-12-31", "1976-12-31", "1977-12-31", "1978-12-31",
>          "1979-12-31", "1981-6-30", "1982-6-30", "1983-6-30",
>          "1985-6-30", "1987-12-31", "1989-12-31", "1990-12-31",
>          "1992-6-30", "1993-6-30", "1994-6-30","1995-12-31",
>          "1997-6-30", "1998-12-31", "2005-12-31", "2008-12-31")
>    .leap.seconds <- strptime(paste(.leap.seconds , "23:59:60"),
>                              "%Y-%m-%d %H:%M:%S")
>    c(as.POSIXct(.leap.seconds, "GMT")) # lose the timezone
> })
>
>
> S Ellison
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>



