[Rd] Model object, when generated in a function, saves entire environment when saved
William Dunlap
wdunlap at tibco.com
Wed Jul 27 20:19:44 CEST 2016
One way around this problem is to make a new environment whose
parent environment is .GlobalEnv and which contains only what
the call to lm() requires, and to compute lm() in that environment. E.g.,
tfun1 <- function (subset)
{
    junk <- 1:1e+06
    env <- new.env(parent = globalenv())
    env$subset <- subset
    with(env, lm(Sepal.Length ~ Sepal.Width, data = iris, subset = subset))
}
Then we get
> saveSize(tfun1(1:4)) # see below for def. of saveSize
[1] 910
instead of the 2129743 bytes in the save file when using the naive method.
saveSize <- function (object) {
    tf <- tempfile(fileext = ".RData")
    on.exit(unlink(tf))
    save(object, file = tf)
    file.size(tf)
}
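To see why the environment matters, it may help to look at what the fitted
model actually carries around. The terms component of an lm object has an
".Environment" attribute pointing at the environment in which the formula
was created, and save() writes that environment out along with the model.
The sketch below compares the two cases. Note that tfun0 (a guess at what
the "naive method" looks like) and envContents are names made up for this
illustration; they are not from the original post.

```r
# Hypothetical "naive" version, for comparison only: lm() is called directly
# in the function body, so the formula is created in the function's frame.
tfun0 <- function(subset) {
    junk <- 1:1e6
    lm(Sepal.Length ~ Sepal.Width, data = iris, subset = subset)
}

# The approach from this message: fit inside a small dedicated environment.
tfun1 <- function(subset) {
    junk <- 1:1e6
    env <- new.env(parent = globalenv())
    env$subset <- subset
    with(env, lm(Sepal.Length ~ Sepal.Width, data = iris, subset = subset))
}

# Hypothetical helper: names of the objects in the environment attached to
# a fit's terms. This is the environment that gets saved with the model.
envContents <- function(fit) sort(ls(attr(terms(fit), ".Environment")))

naive_names <- envContents(tfun0(1:4))
small_names <- envContents(tfun1(1:4))
print(naive_names)  # includes "junk", which the model does not need
print(small_names)  # only "subset"
```

Listing the environment this way is a cheap check before saving many models
in a loop: anything that shows up there will end up in the save file.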
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Wed, Jul 27, 2016 at 10:48 AM, Kenny Bell <kmb56 at berkeley.edu> wrote:
> In the below, I generate a model from an environment that isn't
> .GlobalEnv with a large object that is unrelated to the model
> generation. It seems to save the irrelevant object unnecessarily. In
> my actual use case, I am running and saving many models in a loop that
> each use a single large data.frame (that gets collapsed into a small
> data.frame for estimation), so removing it isn't an option.
>
> In the case where the model exists in .GlobalEnv, everything is
> peachy. So replicating whatever happens when saving the model that was
> generated in .GlobalEnv at the return() stage of the function call
> would fix this problem.
>
> I was referred to this list from r-bugs. First time r-devel poster.
>
> Hope this helps,
>
> Kendon
>
> ```
> tmp_fun <- function(x){
>   iris_big <- lapply(1:10000, function(x) iris)
>   lm(Sepal.Length ~ Sepal.Width, data = iris)
> }
>
> out <- tmp_fun(1)
> object.size(out)
> # 48008
> save(out, file = "tmp.RData", compress = FALSE)
> file.size("tmp.RData")
> # 57196752 - way too big
>
> # Works fine when in .GlobalEnv
> iris_big <- lapply(1:10000, function(x) iris)
> out <- lm(Sepal.Length ~ Sepal.Width, data = iris)
>
> object.size(out)
> # 48008
> save(out, file = "tmp.RData", compress = FALSE)
> file.size("tmp.RData")
> # 16641 - good size.
> ```
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>