[Rd] Model object, when generated in a function, saves entire environment when saved
Duncan Murdoch
murdoch@dunc@n @end|ng |rom gm@||@com
Wed Jan 29 21:24:22 CET 2020
On 29/01/2020 2:25 p.m., Kenny Bell wrote:
> Reviving an old thread. I haven't noticed this be a problem for a while
> when saving RDS's which is great. However, I noticed the problem again when
> saving `qs` files (https://github.com/traversc/qs) which is an RDS
> replacement with a fast serialization / compression system.
>
> I'd like to get an idea of what change was made within R to address this
> issue for `saveRDS`. My thought is that this will help the author of the
> `qs` package do something similar. I have had a browse through the release
> notes for the last few years (Ctrl-F-ing "environment") and couldn't see it.
The vector 1:1e+08 is stored very compactly in recent R versions (the
start and end plus a marker that it's a sequence), and it appears
saveRDS takes advantage of that while qs::qsave doesn't. That's not a
very useful test, because environments typically aren't filled with long
sequence vectors. If you replace the line

    junk <- 1:1e+08

with

    junk <- runif(1e+08)

you'll see drastically different results (the "#>" lines echo the sizes
from the original example, which used the 1:1e+08 vector):
> save_size_qs(normal_lm())
[1] 417953609
> #> [1] 848396
> save_size_rds(normal_lm())
[1] 532614827
> #> [1] 4163
> save_size_qs(normal_ggplot())
[1] 417967987
> #> [1] 857446
> save_size_rds(normal_ggplot())
[1] 532624477
> #> [1] 12895
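
The compact storage described above can be checked without writing any
files. What follows is only a rough sketch, not the code from the
original example: it assumes R >= 3.5.0 (where sequences built by `:`
and seq_len() became compact ALTREP objects), uses serialize() in
memory instead of saveRDS(), uses a smaller n than the 1e+08 above, and
make_formula() is a hypothetical helper invented for illustration.
Exact byte counts will vary by R version, so none are shown.

```r
# Sketch only: sizes are approximate and depend on the R version.
n <- 1e6  # smaller than the 1e+08 in the thread, to keep this quick

# A sequence made by `:` is stored compactly, and serialization
# format 3 writes that compact form.
size_seq_v3 <- length(serialize(1:n, NULL, version = 3))

# Format 2 has no compact form, so every element is written out.
size_seq_v2 <- length(serialize(1:n, NULL, version = 2))

# Random doubles cannot be stored compactly in either format.
size_rand <- length(serialize(runif(n), NULL, version = 3))

# Hypothetical helper: the formula's environment is the frame of this
# call, so `junk` is serialized along with the formula.
make_formula <- function(random) {
  junk <- if (random) runif(n) else seq_len(n)
  y ~ x
}
size_env_seq  <- length(serialize(make_formula(FALSE), NULL, version = 3))
size_env_rand <- length(serialize(make_formula(TRUE),  NULL, version = 3))

print(c(seq_v3 = size_seq_v3, seq_v2 = size_seq_v2, rand = size_rand,
        env_seq = size_env_seq, env_rand = size_env_rand))
```

If the assumptions hold, seq_v3 and env_seq should be tiny while the
others scale with n, which matches the pattern in the results above: a
serializer that does not write the compact form ends up storing every
element of the captured environment.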
Duncan Murdoch