[Rd] [External] memory consumption of nested (un)serialize of sys.frames()
Andreas Kersting
r-deve| @end|ng |rom @ker@t|ng@de
Wed Apr 7 16:06:23 CEST 2021
Hi Luke,
Please see https://github.com/akersting/dumpTest for the package.
Here is a session showing my issue:
> library(dumpTest)
> sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dumpTest_0.1.0
loaded via a namespace (and not attached):
[1] compiler_4.0.5
> for (i in 1:100) {
+ print(i)
+ print(system.time(f()))
+ }
[1] 1
user system elapsed
0.028 0.004 0.034
[1] 2
user system elapsed
0.067 0.008 0.075
[1] 3
user system elapsed
0.176 0.000 0.176
[1] 4
user system elapsed
0.335 0.012 0.349
[1] 5
user system elapsed
0.745 0.023 0.770
[1] 6
user system elapsed
1.495 0.060 1.572
[1] 7
user system elapsed
2.902 0.136 3.040
[1] 8
user system elapsed
5.753 0.272 6.034
[1] 9
user system elapsed
11.807 0.708 12.597
[1] 10
^C
Timing stopped at: 6.638 0.549 7.214
I had to interrupt in iteration 10 because I was running low on RAM.
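As a rough check on how fast the run time grows, the sketch below computes the ratio between consecutive elapsed times from the session above. The variable names `elapsed` and `ratios` are illustrative only, and the values are copied by hand from the output for iterations 1 to 9. Each call takes roughly twice as long as the previous one (all ratios fall between about 1.9 and 2.4). This only describes the growth and does not by itself explain the cause.

```r
# Elapsed seconds for iterations 1 to 9, copied from the session output above.
elapsed <- c(0.034, 0.075, 0.176, 0.349, 0.770, 1.572, 3.040, 6.034, 12.597)

# Ratio of each iteration's elapsed time to that of the previous iteration.
ratios <- elapsed[-1] / elapsed[-length(elapsed)]
print(round(ratios, 2))
```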
Regards,
Andreas
2021-04-07 15:28 GMT+02:00 luke-tierney using uiowa.edu:
> On Wed, 7 Apr 2021, Andreas Kersting wrote:
>
>> Hi,
>>
>> please consider the following minimal reproducible example:
>>
>> Create a new R package which just contains the following two (exported) objects:
>
> I would not expect this behavior and I don't see it when I make such a
> package (in R 4.0.3 or R-devel on Ubuntu). You will need to provide a
> more complete reproducible example if you want help with what you are
> trying to do; also sessionInfo() would help.
>
> Best,
>
> luke
>
>>
>>
>> crash_dumps <- new.env()
>>
>> f <- function() {
>> x <- runif(1e5)
>> dump <- lapply(1:2, function(i) unserialize(serialize(sys.frames(), NULL)))
>> assign("last.dump", dump, crash_dumps)
>> }
>>
>>
>> WARNING: the following will probably eat all your RAM!
>>
>> Attach this package and run:
>>
>> for (i in 1:100) {
>> print(i)
>> f()
>> }
>>
>> You will notice that with each iteration the execution of f() slows down significantly while the memory consumption of the R process (v4.0.5 on Linux) quickly explodes.
>>
>> I am having a hard time understanding what exactly is happening here. Is it something to do with too deeply nested environments? Could someone please enlighten me? Thanks!
>>
>> Regards,
>> Andreas
>>
>>
>> Background:
>> In an R package I store crash dumps on error in parallel processes in a way similar to what I have just shown (hence the (un)serialize(), which happens as part of returning the objects to the parent process). The first 2 or 3 times I do so in a session everything is fine, but afterwards it takes very long and I soon run out of memory.
>>
>> Some more observations:
>> - If I omit `x <- runif(1e5)`, the issues seem to be less pronounced.
>> - If I assign to .GlobalEnv instead of crash_dumps, there seems to be no issue - probably because .GlobalEnv is not included in sys.frames(), while crash_dumps is included indirectly, via the namespace of the package being the parent.env of some of the sys.frames()!?
>> - If I omit the lapply(...), i.e. use `dump <- unserialize(serialize(sys.frames(), NULL))` directly, there seems to be no issue. The immediate consequence is that there are fewer sys.frames and - in particular - there is no frame which has the base namespace as its parent.env.
>> - If I make crash_dumps a list and use assignInMyNamespace() to store the dump in it, there also seems to be no issue. I will probably use this as a workaround:
>>
>> crash_dumps <- list()
>>
>> f <- function() {
>> x <- runif(1e5)
>> dump <- lapply(1:2, function(i) unserialize(serialize(sys.frames(), NULL)))
>> crash_dumps[["last.dump"]] <- dump
>> assignInMyNamespace("crash_dumps", crash_dumps)
>> }
>>
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa Phone: 319-335-3386
> Department of Statistics and Fax: 319-335-3017
> Actuarial Science
> 241 Schaeffer Hall email: luke-tierney using uiowa.edu
> Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu
>
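The observations quoted above refer to the parent.env of the individual sys.frames(). A minimal sketch for inspecting this is shown below. The helper `frame_parents` and the wrapper `g` are hypothetical names introduced here for illustration and are not part of the dumpTest package. The code only lists the name of each frame's enclosing environment, and it makes no claim about why memory use grows.

```r
# Hypothetical helper (not part of dumpTest): for every frame currently on
# the call stack, return the name of its enclosing environment.
# environmentName() gives "" for anonymous environments such as function frames.
frame_parents <- function() {
  frames <- as.list(sys.frames())
  vapply(frames, function(e) environmentName(parent.env(e)), character(1))
}

# Mimic the structure of f(): an anonymous function called through lapply().
g <- function() lapply(1:2, function(i) frame_parents())

res <- g()
print(res[[1]])
```

When run from the top level, the frame created by lapply() should be reported with the enclosure "base", while the frame of the anonymous function has an unnamed enclosure (the frame of `g`).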