[Rd] Return function from function with minimal environment
Henrik Bengtsson
hb at maths.lth.se
Tue Apr 4 17:38:53 CEST 2006
On 4/4/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> On 4/4/06, Henrik Bengtsson <hb at maths.lth.se> wrote:
> > On 4/4/06, Thomas Lumley <tlumley at u.washington.edu> wrote:
> > > On Tue, 4 Apr 2006, Henrik Bengtsson wrote:
> > >
> > > > Hi,
> > > >
> > > > this relates to the question "How to set a former environment?" asked
> > > > yesterday. What is the best way to return a function with a
> > > > minimal environment from a function? Here is a dummy example:
> > > >
> > > > foo <- function(huge) {
> > > >   scale <- mean(huge)
> > > >   function(x) { scale * x }
> > > > }
> > > >
> > > > fcn <- foo(1:10e5)
> > > >
> > > > The problem with this approach is that the environment of 'fcn' does
> > > > not only hold 'scale' but also the memory consuming object 'huge',
> > > > i.e.
> > > >
> > > > env <- environment(fcn)
> > > > ll(envir=env) # ll() from R.oo
> > > > # member data.class dimension object.size
> > > > # 1 huge numeric 1000000 4000028
> > > > # 2 scale numeric 1 36
> > > >
> > > > save(env, file="temp.RData")
> > > > file.info("temp.RData")$size
> > > > # [1] 2007624
> > > >
> > > > I generate quite a few of these and my 'huge' objects are of order
> > > > 100Mb, and I want to keep memory usage as well as file sizes to a
> > > > minimum. What I do now is to remove the variable from the local
> > > > environment of 'foo' before returning, i.e.
> > > >
> > > > foo2 <- function(huge) {
> > > >   scale <- mean(huge)
> > > >   rm(huge)
> > > >   function(x) { scale * x }
> > > > }
> > > >
> > > > fcn <- foo2(1:10e5)
> > > > env <- environment(fcn)
> > > > ll(envir=env)
> > > > # member data.class dimension object.size
> > > > # 1 scale numeric 1 36
> > > >
> > > > save(env, file="temp.RData")
> > > > file.info("temp.RData")$size
> > > > # [1] 156
> > > >
> > > > Since my "foo" functions are complicated and contain many local
> > > > variables, it becomes tedious to identify and remove all of them, so
> > > > instead I try:
> > > >
> > > > foo3 <- function(huge) {
> > > >   scale <- mean(huge);
> > > >   env <- new.env();
> > > >   assign("scale", scale, envir=env);
> > > >   bar <- function(x) { scale * x };
> > > >   environment(bar) <- env;
> > > >   bar;
> > > > }
> > > >
> > > > fcn <- foo3(1:10e5)
> > > >
> > > > But,
> > > >
> > > > env <- environment(fcn)
> > > > save(env, file="temp.RData");
> > > > file.info("temp.RData")$size
> > > > # [1] 2007720
> > > >
> > > > When I try to set the parent environment of 'env' to emptyenv(), it
> > > > does not work, e.g.
> > > >
> > > > fcn(2)
> > > > # Error in fcn(2) : attempt to apply non-function
> > > >
> > > > but with the new.env(parent=baseenv()) it works fine. The "base"
> > > > environment has the empty environment as a parent. So, I try to do
> > > > the same myself, i.e. new.env(parent=new.env(parent=emptyenv())), but
> > > > once again I get
> > >
> > > I don't think you want to remove baseenv() from the environment. If you
> > > do, no functions from baseenv will be visible inside fcn. These include
> > > "{" and "*", which are necessary for your function. I think the error
> > > message comes from being unable to find "{".
> >
> > Thank you, this makes sense. Modifying Roger Peng's example
> > illustrates what you say:
> >
> > foo <- function(huge) {
> >   scale <- mean(huge)
> >   g <- function(x) x
> >   environment(g) <- emptyenv()
> >   g
> > }
> >
> > fcn <- foo(1:10e5)
> > fcn(2)
> > # [1] 2
> >
> > But as soon as you add "something" to g(), that something cannot be found:
> >
> > foo <- function(huge) {
> >   scale <- mean(huge)
> >   g <- function(x) { x }
> >   environment(g) <- emptyenv()
> >   g
> > }
> >
> > fcn <- foo(1:10e5)
> > fcn(2)
> > # Error in fcn(2) : attempt to apply non-function
> >
> > ...and I did not know that "{" and "(" are primitive functions. Interesting.
> >
> > I conclude that 'env <- new.env(parent=baseenv())' is better than
> > 'env <- new.env()' in my case.
>
> Is there any reason to use
>
> env <- new.env(parent=baseenv())
>
> instead of just
>
> env <- baseenv() ?
>
> The extra environment being created seems to serve no purpose.

I need to do this, because I do not want to assign 'scale' to the base
environment:

foo <- function(huge) {
  scale <- mean(huge)
  env <- new.env(parent=baseenv())
  # cf. env <- baseenv()
  assign("scale", scale, envir=env)
  bar <- function(x) { scale * x }
  environment(bar) <- env
  bar
}
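
As a quick sanity check, here is a minimal sketch of how one might verify
that the returned function carries only 'scale' and has the base environment
as its enclosure. The name 'foo4' and the checks at the end are my own
illustration (assumptions, not something run earlier in this thread); the
construction itself is the same as in 'foo' above.

```r
# Hypothetical name 'foo4'; same construction as 'foo' above.
foo4 <- function(huge) {
  scale <- mean(huge)
  # Fresh environment whose parent is the base environment, so that
  # "{" and "*" are found, but the local frame of foo4() (which holds
  # 'huge') is not on the lookup chain.
  env <- new.env(parent = baseenv())
  assign("scale", scale, envir = env)
  bar <- function(x) { scale * x }
  environment(bar) <- env
  bar
}

fcn <- foo4(1:10e5)
env <- environment(fcn)

# Illustrative checks (assumed, not from the original thread):
only_scale   <- identical(ls(envir = env), "scale")       # only 'scale' is stored
base_parent  <- identical(parent.env(env), baseenv())     # enclosure is baseenv()
huge_visible <- exists("huge", envir = env)               # is 'huge' reachable?
result       <- fcn(2)                                    # function still evaluates
```

If this behaves as I expect, 'only_scale' and 'base_parent' are TRUE and
'huge_visible' is FALSE, so saving 'env' should no longer drag 'huge' along.
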
/Henrik
--
Henrik Bengtsson
Mobile: +46 708 909208 (+2h UTC)