[Rd] Return function from function with minimal environment

Henrik Bengtsson hb at maths.lth.se
Tue Apr 4 17:02:16 CEST 2006


On 4/4/06, Thomas Lumley <tlumley at u.washington.edu> wrote:
> On Tue, 4 Apr 2006, Henrik Bengtsson wrote:
>
> > Hi,
> >
> > this relates to the question "How to set a former environment?" asked
> > yesterday.  What is the best way to return a function with a
> > minimal environment from a function? Here is a dummy example:
> >
> > foo <- function(huge) {
> >  scale <- mean(huge)
> >  function(x) { scale * x }
> > }
> >
> > fcn <- foo(1:10e5)
> >
> > The problem with this approach is that the environment of 'fcn' does
> > not only hold 'scale' but also the memory consuming object 'huge',
> > i.e.
> >
> > env <- environment(fcn)
> > ll(envir=env)  # ll() from R.oo
> > #   member data.class dimension object.size
> > # 1   huge    numeric   1000000     4000028
> > # 2  scale    numeric         1          36
> >
> > save(env, file="temp.RData")
> > file.info("temp.RData")$size
> > # [1] 2007624
> >
> > I generate quite a few of these and my 'huge' objects are of order
> > 100Mb, and I want to keep memory usage as well as file sizes to a
> > minimum.  What I do now is to remove variables from the local
> > environment of 'foo' before returning, i.e.
> >
> > foo2 <- function(huge) {
> >  scale <- mean(huge)
> >  rm(huge)
> >  function(x) { scale * x }
> > }
> >
> > fcn <- foo2(1:10e5)
> > env <- environment(fcn)
> > ll(envir=env)
> > #   member data.class dimension object.size
> > # 1  scale    numeric         1          36
> >
> > save(env, file="temp.RData")
> > file.info("temp.RData")$size
> > # [1] 156
> >
> > Since my "foo" functions are complicated and contain many local
> > variables, it becomes tedious to identify and remove all of them, so
> > instead I try:
> >
> > foo3 <- function(huge) {
> >  scale <- mean(huge);
> >  env <- new.env();
> >  assign("scale", scale, envir=env);
> >  bar <- function(x) { scale * x };
> >  environment(bar) <- env;
> >  bar;
> > }
> >
> > fcn <- foo3(1:10e5)
> >
> > But,
> >
> > env <- environment(fcn)
> > save(env, file="temp.RData");
> > file.info("temp.RData")$size
> > # [1] 2007720
> >
> > When I try to set the parent environment of 'env' to emptyenv(), it
> > does not work, e.g.
> >
> > fcn(2)
> > # Error in fcn(2) : attempt to apply non-function
> >
> > but with the new.env(parent=baseenv()) it works fine. The "base"
> > environment has the empty environment as a parent.  So, I try to do
> > the same myself, i.e. new.env(parent=new.env(parent=emptyenv())), but
> > once again I get
>
> I don't think you want to remove baseenv() from the environment. If you
> do, no functions from baseenv will be visible inside fcn. These include
> "{" and "*", which are necessary for your function. I think the error
> message comes from being unable to find "{".

Thank you, this makes sense. Modifying Roger Peng's example
illustrates what you say:

foo <- function(huge) {
        scale <- mean(huge)
        g <- function(x) x
        environment(g) <- emptyenv()
        g
}

fcn <- foo(1:10e5)
fcn(2)
# [1] 2

But as soon as you add "something" to the body of g(), it cannot be found:

foo <- function(huge) {
        scale <- mean(huge)
        g <- function(x) { x }
        environment(g) <- emptyenv()
        g
}

fcn <- foo(1:10e5)
fcn(2)
# Error in fcn(2) : attempt to apply non-function

...and I did not know that "{" and "(" are primitive functions.  Interesting.
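
A quick way to check this, as a minimal sketch: is.primitive() reports
whether a function is a primitive, and exists() shows that "{" can be
found from baseenv() but not from emptyenv(). The variable names below
are only illustrative.

```r
# "{", "(" and "*" are functions like any other; they happen to be primitives
brace_is_primitive <- is.primitive(`{`)
paren_is_primitive <- is.primitive(`(`)
times_is_primitive <- is.primitive(`*`)

# Lookup succeeds from the base environment ...
found_in_base <- exists("{", envir = baseenv())
# ... but the empty environment contains nothing and has no parent to search
found_in_empty <- exists("{", envir = emptyenv())

print(c(brace = brace_is_primitive, paren = paren_is_primitive,
        times = times_is_primitive))
print(c(base = found_in_base, empty = found_in_empty))
```

This is why a function whose environment is emptyenv() can return a bare
symbol such as 'x' but fails as soon as its body calls anything at all.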

I conclude that 'env <- new.env(parent=baseenv())' is better than
'env <- new.env()' in my case.
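
A minimal sketch of that conclusion, reusing the foo3 example from the
quoted message with only the parent of the new environment changed. The
name 'foo4' is hypothetical and used here just for illustration.

```r
# foo4 (hypothetical name): return a closure whose environment holds only
# what it needs, with the base environment as parent so "{" and "*" resolve.
foo4 <- function(huge) {
  scale <- mean(huge)
  env <- new.env(parent = baseenv())   # not the frame of foo4, so no 'huge'
  assign("scale", scale, envir = env)
  bar <- function(x) { scale * x }
  environment(bar) <- env
  bar
}

fcn <- foo4(1:10e5)
result <- fcn(2)                       # works: "{" and "*" come from baseenv()
members <- ls(environment(fcn))        # only "scale" is stored
print(result)
print(members)
```

One consequence of this design: since the parent of baseenv() is the
empty environment, only base functions are visible inside the returned
function. Anything from another package would have to be called with an
explicit prefix, e.g. stats::median(), or copied into 'env'.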

I learned something new. Thanks.

Henrik

> Also, there is no memory use from having baseenv in the environment, since
> all the objects in baseenv are always present.
>
>         -thomas
>
>
> Thomas Lumley                   Assoc. Professor, Biostatistics
> tlumley at u.washington.edu        University of Washington, Seattle


--
Henrik Bengtsson
Mobile: +46 708 909208 (+2h UTC)


