[Rd] save() saves extra stuff if object is not evaluated

Henrik Bengtsson hb at stat.berkeley.edu
Fri May 26 03:03:27 CEST 2006


On 5/25/06, Luke Tierney <luke at stat.uiowa.edu> wrote:
> On Thu, 25 May 2006, Henrik Bengtsson wrote:
>
> > Hi,
> >
> > it looks like save() is saving all contents of the calling
> > environments if the object to be saved is *not* evaluated, although it
> > is not that simple either.
>
> No, it's exactly that simple.  Serialization follows and writes out
> all reachable environments.  Unevaluated promises contain the
> environments in which their evaluations are to occur; evaluated ones
> have this field set to R_NilValue to eliminate this no longer needed
> reference.
>
> There are two environments involved: the calling environment in which
> saveCache is called and the callee environment of the call to
> saveCache where the body of saveCache is evaluated.  Because of
> lexical scope the enclosing environment of the callee environment is
> the closure environment of saveCache, which is .GlobalEnv.
>
> The call to saveCache creates a promise for evaluating the default
> value for 'sources' _in the callee environment_. In the case with y the
> callee environment includes a value of y which is a promise
> referencing the calling environment (either .GlobalEnv or the
> environment of the call to main).  In the calls without y the value of
> y in the callee environment is the missing value indicator, not a
> promise.  So only with y and no eval is there a reference to the
> calling environment that serialization then has to write out.
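
The mechanism described above can be illustrated with a minimal sketch that is not part of the original thread. The names frame_bytes() and caller() are hypothetical, and serialize() stands in for save(). The sketch assumes that an unevaluated promise is serialized together with the environment it refers to, as explained above, and compares the serialized size of the callee frame before and after the promise is forced.

```r
# Hypothetical helper (not from the thread): serialize the callee frame and
# report how many bytes that takes. 'x' is bound to a promise in that frame.
frame_bytes <- function(x, force_first = FALSE) {
  if (force_first) force(x)   # evaluating the promise drops its environment
  length(serialize(environment(), NULL))
}

# Hypothetical caller with a large local variable, playing the role of main().
caller <- function(force_first) {
  big <- numeric(1e5)         # about 800 kB, present only in this frame
  y <- 1
  frame_bytes(y, force_first = force_first)
}

lazy  <- caller(FALSE)  # promise for 'y' still references caller's frame
eager <- caller(TRUE)   # promise already evaluated; no reference remains
cat(sprintf("lazy: %d bytes, eager: %d bytes\n", lazy, eager))
```

If the assumption holds, the lazy size includes the 'big' vector from the caller's frame, while the eager size covers only the small callee frame.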

Thank you very much for this sharp explanation.  It is now much
clearer to me what is going on.  Would it make sense to make save()
evaluate all non-evaluated arguments, e.g. is.null(list(...))?  ...or,
add an argument making this optional/default?
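
For comparison, the documentation of save() in current R releases lists an eval.promises argument (default TRUE) that forces promises before they are written. The following is a minimal sketch under that assumption, not code from this thread. save_arg() and caller() are hypothetical names, and the effect of eval.promises = FALSE is assumed to match the behaviour reported above.

```r
# Hypothetical wrapper (not from the thread): save the argument 'x' to a
# temporary file and return the resulting file size in bytes.
save_arg <- function(x, eval.promises) {
  file <- tempfile(fileext = ".RData")
  on.exit(unlink(file))
  save(x, file = file, compress = FALSE, eval.promises = eval.promises)
  file.size(file)
}

# Hypothetical caller with a large local variable, playing the role of main().
caller <- function(eval.promises) {
  big <- numeric(1e5)   # large local variable in the calling frame
  y <- 1
  save_arg(y, eval.promises = eval.promises)
}

forced   <- caller(TRUE)   # promise for 'x' is evaluated before writing
unforced <- caller(FALSE)  # promise is written along with its environment
cat(sprintf("forced: %.0f bytes, unforced: %.0f bytes\n", forced, unforced))
```

With promises forced, only the value of 'x' should reach the file; without forcing, the caller's frame is expected to be written as well.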

Best wishes,

Henrik

> Best,
>
> luke
>
>
> > After many hours of troubleshooting, I'm
> > still confused.  Here is a reproducible example (also attached) with
> > output.  I let the code and the output speak for themselves:
> >
> > # Print the file size and the readable strings found in the file
> > peek <- function(file, from=1, to=500) {
> >   cat("--------------------------------------\n")
> >   cat(sprintf("%s: %d bytes\n", file, file.info(file)$size))
> >   bfr <- suppressWarnings(readBin(file, what="character", n=to))
> >   # Drop control characters so the remaining text is readable
> >   bfr <- gsub("(\001|\002|\003|\004|\005|\016|\020|\036|\a|\n|\t)", "", bfr)
> >   bfr <- bfr[nchar(bfr) > 0]
> >   cat(bfr, sep="", "\n")
> > }
> >
> > saveCache <- function(file, y, sources=NULL, eval=FALSE) {
> >   # With eval=TRUE, 'sources' is evaluated before it is saved
> >   if (eval)
> >     dummy <- is.null(sources)
> >   base::save(file=file, sources, compress=FALSE)
> > }
> >
> > aVariableNotSaved <- double(1e6)
> >
> > main <- function() {
> >   # This 'big' variable is saved in case 1 below!
> >   big <- rep(letters, length.out=1e5)
> >   identifier <- "This string will be saved too!"
> >
> >   y <- 1
> >
> >   file <- "a.RData"
> >   saveCache(y, file=file)
> >   peek(file)
> >
> >   file <- "a-eval.RData"
> >   saveCache(y, file=file, eval=TRUE)
> >   peek(file)
> >
> >   file <- "b-noy.RData"
> >   saveCache(file=file)
> >   peek(file)
> >
> >   file <- "b-noy-eval.RData"
> >   saveCache(file=file, eval=TRUE)
> >   peek(file)
> > }
> >
> >
> > # 1. Call saveCache() outside main()
> > eval(body(main))
> > # --------------------------------------
> > # a.RData: 238 bytes
> > # RDX2Xsources淬ilea.RData y爽 $  n�eval戌好�
> > # --------------------------------------
> > # a-eval.RData: 58 bytes
> > # RDX2Xsources戌
> > # --------------------------------------
> > # b-noy.RData: 230 bytes
> > # RDX2Xsources淬ile?b-noy.RData 鈇v$  n�eval戌好�
> > # --------------------------------------
> > # b-noy-eval.RData: 58 bytes
> > # RDX2Xsources戌
> >
> > # 2. Call saveCache() from within main()
> > main()
> > # --------------------------------------
> > # a.RData: 900412 bytes
> > # RDX2Xsources淬ilea.RData y�a.RData ?=identifierThis
> > # string will be saved too!big槦abcdefghijklmnopqrstuv
> > # wxyzabcdefghijklmnopqrstuvwxyzabcdefg
> > # --------------------------------------
> > # a-eval.RData: 58 bytes
> > # RDX2Xsources戌
> > # --------------------------------------
> > # b-noy.RData: 230 bytes
> > # RDX2Xsources淬ile?b-noy.RData 鈇v$  n�eval戌好�
> > # --------------------------------------
> > # b-noy-eval.RData: 58 bytes
> > # RDX2Xsources戌
> >
> > What is going on?
> >
> > I get this on both R v2.3.0 patched (2006-04-28 r37936) and R v2.3.1
> > beta (2006-05-23 r38179) on my WinXP (with Rterm --vanilla).
> >
>
> --
> Luke Tierney
> Chair, Statistics and Actuarial Science
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:             319-335-3386
> Department of Statistics and        Fax:               319-335-3017
>     Actuarial Science
> 241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
> Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
>


