[Rd] how to control the environment of a formula
Duncan Murdoch
murdoch.duncan at gmail.com
Sat Apr 20 19:44:35 CEST 2013
On 13-04-19 2:57 PM, Thomas Alexander Gerds wrote:
>
> hmm. I have tested a bit more, and found this situation, which is perhaps
> more difficult to solve. even though I delete x, since x is part of the
> output of the function, the size of the object is twice as much as it should be:
>
> test <- function(x){
> x <- rnorm(1000000)
> out <- list(x=x)
> rm(x)
> out$f <- as.formula(a~b)
> out
> }
> v <- test(1)
> x <- rnorm(1000000)
> save(v,file="~/tmp/v.rda")
> save(x,file="~/tmp/x.rda")
> system("ls -lah ~/tmp/*.rda")
>
> -rw-rw-r-- 1 tag tag 15M Apr 19 20:52 /home/tag/tmp/v.rda
> -rw-rw-r-- 1 tag tag 7,4M Apr 19 20:52 /home/tag/tmp/x.rda
>
> can you solve this as well?
Yes, this is tricky. The problem is that "out" is in the environment of
out$f, so you get two copies when you save it. (I think you won't have
two copies in memory, because R only makes a copy when it needs to, but
I haven't traced this.)
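A quick way to see this without writing files is to look at what the formula's environment holds and to compare serialized sizes. This is only a sketch: it uses serialize() in place of save(), so there is no compression, and a smaller vector than in the original post to keep it fast.

```r
test <- function(x) {
  x <- rnorm(100000)
  out <- list(x = x)
  rm(x)
  out$f <- as.formula(a ~ b)
  out
}
v <- test(1)

# The formula's environment is the frame of test(), which still holds "out"
contents <- ls(environment(v$f))
print(contents)

# Uncompressed serialized sizes: v carries the data once directly and once
# more through the environment attached to v$f
size_v <- length(serialize(v, NULL))
size_x <- length(serialize(v$x, NULL))
print(size_v / size_x)
```

The ratio should come out near 2, which matches the file sizes reported above.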
Here are two solutions; both have some problems.
1. Don't put out in the environment:
test <- function(x) {
  x <- rnorm(1000000)
  out <- list(x=x)
  out$f <- a ~ b   # the as.formula() was never needed
  # temporarily create a new environment
  local({
    # get a copy of what you want to keep
    out <- out
    # remove everything that you don't need from the formula's environment
    rm(list=c("x", "out"), envir=environment(out$f))
    # return the local copy
    out
  })
}
I don't like this because it is too tricky, but you could probably wrap
the tricky bits into a little function (a variant on return() that
cleans out the environment first), so it's probably what I would use if
I was desperate to save space in saved copies.
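A minimal sketch of such a wrapper follows. The name return_clean() and its arguments are made up for illustration and are not part of R. Unlike return(), it has to be the last call in the function, because it cannot force the caller to exit.

```r
# Hypothetical helper: empty the caller's frame, then hand back the value.
return_clean <- function(value, keep = character(), envir = parent.frame()) {
  force(value)  # evaluate the argument before anything is removed
  drop <- setdiff(ls(envir = envir, all.names = TRUE), keep)
  rm(list = drop, envir = envir)
  value
}

test <- function(x) {
  x <- rnorm(1000)
  out <- list(x = x)
  out$f <- a ~ b
  return_clean(out)  # must be the final expression
}

v <- test(1)
remaining <- ls(environment(v$f), all.names = TRUE)
print(remaining)   # nothing is left in the formula's environment
```

Variables that the formula really does need can be listed in keep, so they survive the cleanup.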
2. Never evaluate the formula in the first place, so it doesn't pick up
the environment:
test <- function(x) {
  x <- rnorm(1000000)
  out <- list(x=x)
  out$f <- quote(a ~ b)   # an unevaluated call, not a formula object
  out
}
This is a lot simpler, but it might not work with some modelling
functions, which would be confused by receiving the model formula
unevaluated. It also has the problems that you get with using
.GlobalEnv as the environment of the formula, but maybe to a slightly
lesser extent: rather than having what is possibly the wrong
environment, it doesn't have one at all.
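To make the difference concrete, here is a small sketch of what quote() returns and how it can be turned into a real formula later. The environment e is just a placeholder for wherever the variables of the model actually live.

```r
f <- quote(a ~ b)
print(class(f))           # a call, not a formula
print(environment(f))     # no environment attached

# Evaluating the call runs `~`, which builds the formula and attaches
# the environment in which the evaluation happens.
e <- new.env()
g <- eval(f, e)
print(class(g))
print(identical(environment(g), e))
```

So code that receives the unevaluated version can still get a proper formula, provided it knows which environment to evaluate it in.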
Duncan Murdoch
>
> thanks!
> thomas
>
> Duncan Murdoch <murdoch.duncan at gmail.com> writes:
>
>> On 13-04-18 11:39 AM, Thomas Alexander Gerds wrote:
>>> Dear Duncan
>>> thank you for taking the time to answer my questions! It will be
>>> quite some work to delete all the objects generated inside the
>>> function ... but if there is no other way to avoid a large
>>> environment then this is what I will do.
>>
>> It's not really that hard. Use names <- ls() in the function to get a
>> list of all of them; remove the names of variables that might be
>> needed in the formula (and the name of the formula itself); then use
>> rm(list=names) to delete everything else just before returning it.
>>
>> Duncan Murdoch
>>
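The ls()/rm() recipe quoted above might look like the sketch below. The variable names (a, b, scratch) are placeholders, and the formula is returned directly only to keep the example short.

```r
test <- function(n) {
  a <- rnorm(n)
  b <- rnorm(n)
  scratch <- rnorm(n)   # a large temporary that the formula never needs
  f <- a ~ b
  # everything defined so far, including the argument n
  names <- ls()
  # keep the variables used in the formula, and the formula itself
  names <- setdiff(names, c("a", "b", "f"))
  rm(list = names)
  f
}

f <- test(100)
left <- ls(environment(f))
print(left)
```

Note that the variable names itself stays behind in the environment, since it did not exist yet when ls() was called; it is only a short character vector.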