[Rd] how to control the environment of a formula

Duncan Murdoch murdoch.duncan at gmail.com
Fri Apr 19 16:04:00 CEST 2013


On 13-04-18 11:39 AM, Thomas Alexander Gerds wrote:
> Dear Duncan
>
> thank you for taking the time to answer my questions! It will be quite
> some work to delete all the objects generated inside the function
> ... but if there is no other way to avoid a large environment then this
> is what I will do.

It's not really that hard.  Use names <- ls() in the function to get a
character vector naming every local variable; remove from it the names of
variables that might be needed in the formula (and the name of the object
holding the formula itself); then call rm(list=names) to delete everything
else just before the function returns.
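
A minimal sketch of that recipe, assuming the formula is stored in a list
called out as in the example quoted below. The names make_formula and keep
are only illustrative and do not come from any package:

```r
# Sketch only: make_formula and keep are hypothetical names.
make_formula <- function(n = 1e5) {
  x <- rnorm(n)        # large temporary that the formula does not need
  out <- list()
  out$f <- a ~ b       # the formula's environment is this function's frame

  names <- ls()                    # names of all local variables so far
  keep <- "out"                    # the result, plus anything the formula needs
  names <- setdiff(names, keep)    # everything that can safely go
  rm(list = names)                 # delete it just before returning
  out
}

v <- make_formula()
ls(environment(v$f))               # x is gone; only small objects remain

d <- data.frame(a = 1, b = 1)
mf <- model.frame(v$f, data = d)   # still works: the environment is intact
```

Because the environment itself is kept (only its contents are trimmed),
functions such as model.frame can still look symbols up through it.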

Duncan Murdoch

>
> Cheers
> Thomas
>
> Duncan Murdoch <murdoch.duncan at gmail.com> writes:
>
>> On 13-04-18 1:09 AM, Thomas Alexander Gerds wrote:
>>> Dear List
>>> I have found that objects generated by one of my packages take up
>>> a lot of space when saved to disk (object.size did not show
>>> this!).
>>> Some debugging revealed that formula and call objects carried the
>>> full environment of subroutines along, including even stuff not
>>> needed by the formula or call. Here is a sketch of the problem:
>>> ,----
>>> | test <- function(x){
>>> |   x <- rnorm(1000000)
>>> |   out <- list()
>>> |   out$f <- a~b
>>> |   out
>>> | }
>>> | v <- test(1)
>>> | save(v,file="~/tmp/v.rda")
>>> | system("ls -lah ~/tmp/v.rda")
>>> | -rw-rw-r-- 1 tag tag 7,4M Apr 18 06:41 /home/tag/tmp/v.rda
>>> `----
>>> I tried replacing a~b in the assignment to out$f by
>>> ,----
>>> | as.formula(a~b,env=emptyenv()) or as.formula(a~b,env=NULL)
>>> `----
>>> without the desired effect. Instead adding either
>>> ,----
>>> | environment(out$f) <- emptyenv() or environment(out$f) <- NULL
>>> `----
>>> has the desired effect (i.e. the saved object is much
>>> smaller). Unfortunately there is a new problem:
>>> ,----
>>> | test <- function(x){
>>> |   x <- rnorm(1000000)
>>> |   out <- list()
>>> |   out$f <- a~b
>>> |   environment(out$f) <- emptyenv()
>>> |   out
>>> | }
>>> | d <- data.frame(a=1,b=1)
>>> | v <- test(1)
>>> | model.frame(v$f,data=d)
>>> | Error in eval(expr, envir, enclos) : could not find function "list"
>>> `----
>>> The same happens with NULL in place of emptyenv().
>>> Finally, using .GlobalEnv in place of emptyenv() seems to remove both
>>> problems.
>>
>> But it will cause other, less obvious problems.  In a formula, the
>> symbols mean something.  By setting the environment to .GlobalEnv
>> you're changing the meaning.  You'll get nonsense in certain cases
>> when functions look up the meaning of those symbols and find the wrong
>> thing. (I don't have an example at hand, but I imagine it would be
>> easy to put one together with update().)
>>
>>> My questions:
>>> 1) why does the argument env of as.formula have no effect?
>>
>> Because the first argument already had an associated environment.  You
>> passed a ~ b, which is evaluated to a formula; calling as.formula on a
>> formula does nothing. The env argument is only used when a new formula
>> needs to be constructed.  (You can see this in the source code;
>> as.formula is a very simple function.)
>>
>>> 2) is there a better way to tell formula not to copy unrelated stuff
>>> into the associated environment?
>>
>> Yes, delete it.  For example, you could write your function as
>>
>>   test <- function(x){
>>     x <- rnorm(1000000)
>>     out <- list()
>>     out$f <- a~b
>>     rm(x)
>>     out
>>   }
>>
>>> 3) why does object.size not show the size of the environments that
>>> formulas can carry along?
>>
>> Because many objects can share the same environment.  See ?object.size
>> for more details.
>>
>> Duncan Murdoch
>
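
The as.formula point in the quoted answer to question 1 can be checked
with a short sketch. The variable names f, g, h and e are arbitrary:

```r
f <- a ~ b                       # already a formula, so it has an environment
e <- new.env()

g <- as.formula(f, env = e)      # f is returned unchanged; env is not used
identical(environment(g), environment(f))   # TRUE

h <- as.formula("a ~ b", env = e)  # built from a string, so env is applied
identical(environment(h), e)                # TRUE
```

To change the environment of an existing formula, assign to
environment(f) directly, keeping in mind the caveats about symbol lookup
discussed above.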