[Rd] Model object, when generated in a function, saves entire environment when saved
Harvey Smith
h@rvey13131 @end|ng |rom gm@||@com
Thu Jan 30 22:53:47 CET 2020
Depending on whether you need the data in the referenced environments later,
you could fit the model normally and use the refhook argument of
saveRDS/readRDS to replace references to environments in the model with a
dummy value.
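normal_lm <- function() {
  # junk lives in the function's environment, which the model's formula references
  junk <- runif(1e+08)
  lm(Sepal.Length ~ Sepal.Width, data = iris)
}
object <- normal_lm()
tf <- tempfile(fileext = ".rds")
# On save, replace each reference object (e.g. an environment) with a placeholder string
saveRDS(object, file = tf, refhook = function(...) "")
# On load, substitute .GlobalEnv for each placeholder
object2 <- readRDS(file = tf, refhook = function(...) .GlobalEnv)
file.size(tf)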
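
To see what the hook buys you, you can save the same object with and
without it and compare the file sizes. This is only a rough sketch: the
names tf_plain, tf_hooked, size_plain and size_hooked are mine, and I have
shrunk junk to 1e6 elements so it runs quickly. Keep in mind that the
restored model points at .GlobalEnv instead of the original environment, so
anything that looks up variables through the formula's environment will
search there instead.

```r
normal_lm <- function() {
  junk <- runif(1e6)  # smaller than 1e+08 so the example runs quickly
  lm(Sepal.Length ~ Sepal.Width, data = iris)
}
object <- normal_lm()

tf_plain  <- tempfile(fileext = ".rds")
tf_hooked <- tempfile(fileext = ".rds")

# Default: the environment referenced by the model (including junk) is serialized.
saveRDS(object, file = tf_plain)

# With refhook: each reference object is replaced by a placeholder string.
saveRDS(object, file = tf_hooked, refhook = function(...) "")

size_plain  <- file.size(tf_plain)
size_hooked <- file.size(tf_hooked)
c(plain = size_plain, hooked = size_hooked)

# Restore, substituting .GlobalEnv for each placeholder.
object2 <- readRDS(tf_hooked, refhook = function(...) .GlobalEnv)

# Check that the coefficients survive the round trip and see which
# environment the terms object now refers to.
all.equal(coef(object), coef(object2))
attr(terms(object2), ".Environment")
```

The hook is applied to every reference object in the model, not just the
one holding junk, so this is only appropriate when nothing downstream needs
those environments.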
On Wed, Jan 29, 2020 at 3:24 PM Duncan Murdoch <murdoch.duncan using gmail.com>
wrote:
> On 29/01/2020 2:25 p.m., Kenny Bell wrote:
> > Reviving an old thread. I haven't noticed this be a problem for a while
> > when saving RDS's which is great. However, I noticed the problem again
> > when saving `qs` files (https://github.com/traversc/qs) which is an RDS
> > replacement with a fast serialization / compression system.
> >
> > I'd like to get an idea of what change was made within R to address this
> > issue for `saveRDS`. My thought is that this will help the author of the
> > `qs` package do something similar. I have had a browse through the
> > release notes for the last few years (Ctrl-F-ing "environment") and
> > couldn't see it.
>
> The vector 1:1e+08 is stored very compactly in recent R versions (the
> start and end plus a marker that it's a sequence), and it appears
> saveRDS takes advantage of that while qs::qsave doesn't. That's not a
> very useful test, because environments typically aren't filled with long
> sequence vectors. If you replace the line
>
> junk <- 1:1e+08
>
> with
>
> junk <- runif(1e+08)
>
> you'll see drastically different results:
>
> > save_size_qs(normal_lm())
> [1] 417953609
> > #> [1] 848396
> > save_size_rds(normal_lm())
> [1] 532614827
> > #> [1] 4163
> > save_size_qs(normal_ggplot())
> [1] 417967987
>
> > #> [1] 857446
> > save_size_rds(normal_ggplot())
> [1] 532624477
> > #> [1] 12895
>
> Duncan Murdoch