[Rd] how to control the environment of a formula

Duncan Murdoch murdoch.duncan at gmail.com
Thu Apr 18 14:35:11 CEST 2013


On 13-04-18 1:09 AM, Thomas Alexander Gerds wrote:
> Dear List
>
> I have experienced that objects generated with one of my packages used
> a lot of space when saved on disc (object.size did not show this!).
>
> some debugging revealed that formula and call objects carried the full
> environment of subroutines along, including even stuff not needed by the
> formula or call. here is a sketch of the problem
>
> ,----
> | test <- function(x){
> |   x <- rnorm(1000000)
> |   out <- list()
> |   out$f <- a~b
> |   out
> | }
> | v <- test(1)
> | save(v,file="~/tmp/v.rda")
> | system("ls -lah ~/tmp/v.rda")
> |
> | -rw-rw-r-- 1 tag tag 7,4M Apr 18 06:41 /home/tag/tmp/v.rda
> `----
>
> I tried to replace line 3 by
>
> ,----
> | as.formula(a~b,env=emptyenv())
> | or
> | as.formula(a~b,env=NULL)
> `----
>
> without the desired effect. Instead adding either
>
> ,----
> | environment(out$f) <- emptyenv()
> | or
> | environment(out$f) <- NULL
> `----
>
> has the desired effect (i.e. the saved object size is
> shrunken). unfortunately there is a new problem:
>
> ,----
> | test <- function(x){
> |   x <- rnorm(1000000)
> |   out <- list()
> |   out$f <- a~b
> |   environment(out$f) <- emptyenv()
> |   out
> | }
> | d <- data.frame(a=1,b=1)
> | v <- test(1)
> | model.frame(v$f,data=d)
> |
> | Error in eval(expr, envir, enclos) : could not find function "list"
> `----
>
> Same with NULL in place of emptyenv()
>
> Finally using .GlobalEnv in place of emptyenv() seems to remove both problems.

But it will cause other, less obvious problems.  In a formula, the
symbols mean something, and the formula's environment is where that
meaning is looked up.  By setting the environment to .GlobalEnv you're
changing the meaning.  You'll get nonsense in certain cases when
functions look up those symbols and find the wrong thing.  (I don't
have an example at hand, but I imagine it would be easy to put one
together with update().)
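
A minimal sketch of this kind of mix-up, not taken from the original
message and not using update(): the names make_formula and k are made up
for the illustration. The formula refers to a local k; once its
environment is reset to .GlobalEnv, model.frame() silently picks up an
unrelated global k instead.

```r
# Hypothetical example: 'make_formula' and 'k' are illustrative names only.
make_formula <- function(reset_env = FALSE) {
  k <- 2                      # local constant the formula depends on
  f <- y ~ I(x * k)
  if (reset_env) environment(f) <- globalenv()
  f
}

# An unrelated 'k' in the global workspace. assign() is used so the
# example behaves the same wherever it is run from.
assign("k", 100, envir = globalenv())

d <- data.frame(x = 1:3, y = 1:3)
mf_local  <- model.frame(make_formula(), data = d)
mf_global <- model.frame(make_formula(reset_env = TRUE), data = d)

as.numeric(mf_local[[2]])   # computed with the local k
as.numeric(mf_global[[2]])  # computed with the global k
```

No error or warning is raised in the second case, which is what makes
this class of problem hard to spot.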

> My questions:
>
> 1)  why does the argument env of as.formula have no effect?

Because the first argument already had an associated environment.  You
passed a ~ b, which is evaluated to a formula before as.formula ever
sees it; calling as.formula on an object that is already a formula
returns it unchanged.  The env argument is only used when a new formula
needs to be constructed, e.g. from a character string.  (You can see
this in the source code; as.formula is a very simple function.)
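
A short sketch of that behaviour, assuming a throwaway environment e
created only to make the effect visible:

```r
# 'e' is a hypothetical environment used only for this illustration.
e <- new.env()

f1 <- as.formula(a ~ b, env = e)     # already a formula: env is ignored
f2 <- as.formula("a ~ b", env = e)   # built from a string: env is used

identical(environment(f1), e)              # FALSE
identical(environment(f1), environment())  # TRUE: where a ~ b was evaluated
identical(environment(f2), e)              # TRUE
```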

> 2)  is there a better way to tell formula not to copy unrelated stuff
>      into the associated environment?

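Yes: delete whatever you don't need before the function returns.  The
formula's environment is the function's evaluation frame, so anything
still in that frame is saved along with it.  For example, you could
write your function as

  test <- function(x){
    x <- rnorm(1000000)
    out <- list()
    out$f <- a~b
    rm(x)   # drop the large vector so it is not saved with the formula
    out
  }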
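
One way to check the effect without writing files is to compare the
serialized sizes. This is only a sketch: test_keep and test_rm are
hypothetical names, the vector is smaller than in the original post, and
serialize() is used as a stand-in for what save() writes before
compression.

```r
# Hypothetical helper names; n is kept smaller than in the original post.
test_keep <- function(n = 1e5) {
  x <- rnorm(n)
  out <- list()
  out$f <- a ~ b
  out                      # x stays in the formula's environment
}

test_rm <- function(n = 1e5) {
  x <- rnorm(n)
  out <- list()
  out$f <- a ~ b
  rm(x)                    # x is removed before returning
  out
}

# Length in bytes of the serialized objects (uncompressed).
size_keep <- length(serialize(test_keep(), NULL))
size_rm   <- length(serialize(test_rm(), NULL))
c(keep = size_keep, rm = size_rm)
```

The first size should exceed the second by roughly the 8 bytes per
element taken up by x.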


> 3)  why does object.size not show the size of the environments that
>      formulas can carry along?

Because many objects can share the same environment, so its size can't
sensibly be charged to any one of them.  See ?object.size for more
details.
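
A small sketch of the sharing, using a hypothetical function
two_formulas that creates two formulas in the same call frame:

```r
# Hypothetical function; both formulas are created in the same call frame.
two_formulas <- function(n = 1e5) {
  x <- rnorm(n)            # large object living in the frame
  list(f1 = a ~ b, f2 = c ~ d)
}
v <- two_formulas()

# Both formulas point at one and the same environment.
identical(environment(v$f1), environment(v$f2))

# object.size() reports the formula alone, not the environment's contents.
size_formula <- as.numeric(object.size(v$f1))

# The large vector is still reachable through the formula's environment.
size_x <- as.numeric(object.size(get("x", envir = environment(v$f1))))

c(formula = size_formula, x = size_x)
```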

Duncan Murdoch


