[R-pkg-devel] Unused data is silently kept in the environment of a function

Samuel Granjeaud
Fri Jul 8 15:50:07 CEST 2022


Dear all,

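I build processing functions from the data in a first step, and apply
them to the data in a second step.
In the reprex below, proc_0 increases memory use by about 80 MB (the
size of the data), whereas proc_1 does not. The only difference is that
proc_0 stores a shuffled copy of the data in a local variable, dat,
which the returned functions never use.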

If this behavior is known, could you tell me a workaround before I try 
to guess the best one?

Best,
Samuel

``` r
# for memory tracking
library(pryr)

# a class
setClass(
   "fb",
   slots = list(d = "numeric", f = "list"),
   prototype=list(d = NULL, f = NULL)
)

# memory increases: dat is kept somewhere and stays linked to the
# returned value
proc_0 <- function(x) {
   dat = sample(x@d)
   cofactors = c(mean(dat), median(dat), IQR(dat))
   model = sapply(cofactors, function(cofactor) function(z) z / cofactor)
   x@f = list(model)
   x
}

# init data
mem_used()
#> 47 MB
a = new("fb")
a@d = sample(rnorm(1e7))
a@f = list()
mem_used()
#> 127 MB
# memory increased by 80 MB
# process
b = proc_0(a)
mem_used()
#> 207 MB
# memory increased by 80 MB again
rm(a)
mem_used()
#> 207 MB
# memory did not decrease
b@d = b@d + 1
mem_used()
#> 287 MB
# memory increased
# b@d was really pointing to a@d before the increment
sapply(1:3, function(i) ls(environment(b@f[[1]][[i]])))
#> [1] "cofactor" "cofactor" "cofactor"
sapply(1:3, function(i) get("cofactor", environment(b@f[[1]][[i]])))
#> [1] -0.0003085559  0.0001107148  1.3485980291
# environments look fine
rm(b)
mem_used()
#> 47.5 MB
# memory released back


# memory safe
proc_1 <- function(x) {
   cofactors = c(mean(x@d), median(x@d), IQR(x@d))
   model = sapply(cofactors, function(cofactor) function(z) z / cofactor)
   x@f = list(model)
   x
}

# init data
mem_used()
#> 47.5 MB
a = new("fb")
a@d = sample(rnorm(1e7))
a@f = list()
mem_used()
#> 128 MB
b = proc_1(a)
mem_used()
#> 128 MB
# memory did not increase; b@d points to a@d; the functions weigh a few KB
rm(a)
mem_used()
#> 128 MB
sapply(1:3, function(i) ls(environment(b@f[[1]][[i]])))
#> [1] "cofactor" "cofactor" "cofactor"
sapply(1:3, function(i) get("cofactor", environment(b@f[[1]][[i]])))
#> [1] -0.0003133312 -0.0002510665  1.3491459433

rm(b)
mem_used()
#> 47.5 MB

```

<sup>Created on 2022-07-08 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup>
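
The checks in the reprex only look at the innermost environment of each
function, the one holding `cofactor`. A possible next step is to look one
level up, at the parent of that environment. The sketch below does this on
a scaled-down case. `make_model` is a hypothetical stand-in for proc_0 that
takes a plain numeric vector instead of the `fb` object, and it uses
`lapply` rather than `sapply`; both are assumptions made to keep the
example self-contained.

``` r
# Hypothetical, scaled-down stand-in for proc_0 (plain vector, no S4 class)
make_model <- function(d) {
  dat <- sample(d)
  cofactors <- c(mean(dat), median(dat), IQR(dat))
  # each call of the outer anonymous function creates an environment
  # holding `cofactor`; the inner function is defined in that environment
  lapply(cofactors, function(cofactor) function(z) z / cofactor)
}

model <- make_model(rnorm(1000))

# innermost environment of the first function: holds only `cofactor`
e <- environment(model[[1]])
ls(e)

# parent of that environment: the execution environment of make_model()
p <- parent.env(e)
ls(p)

# size of the local copy that is still reachable through the function
object.size(get("dat", envir = p))
```

If `dat` shows up in `ls(p)`, the returned functions keep the whole
execution environment of the processing function alive, local copy
included, even though they never refer to it.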

<details style="margin-bottom:10px;">
<summary>
Session info
</summary>

``` r
sessionInfo()
#> R version 4.2.1 (2022-06-23 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19044)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=French_France.utf8 LC_CTYPE=French_France.utf8
#> [3] LC_MONETARY=French_France.utf8 LC_NUMERIC=C
#> [5] LC_TIME=French_France.utf8
#>
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets methods   base
#>
#> other attached packages:
#> [1] pryr_0.1.5
#>
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.8.3     codetools_0.2-18 digest_0.6.29 withr_2.5.0
#>  [5] magrittr_2.0.3   reprex_2.0.1     evaluate_0.15 highr_0.9
#>  [9] stringi_1.7.6    rlang_1.0.3      cli_3.3.0 rstudioapi_0.13
#> [13] fs_1.5.2         lobstr_1.1.2     rmarkdown_2.14 tools_4.2.1
#> [17] stringr_1.4.0    glue_1.6.2       xfun_0.31 yaml_2.3.5
#> [21] fastmap_1.1.0    compiler_4.2.1   htmltools_0.5.2 knitr_1.39
```

</details>
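
One workaround I could guess at, shown here only as a sketch: create the
functions in a helper defined outside the processing function, so that
their environment chain does not go through the processing function at
all. `make_scaler` and `proc_2` are hypothetical names, and `proc_2` works
on a plain numeric vector instead of the `fb` object to keep the example
self-contained.

``` r
# Hypothetical helper defined at top level: functions it returns have an
# environment whose parent is the one where make_scaler itself was defined
make_scaler <- function(cofactor) {
  force(cofactor)  # evaluate the argument now rather than keep a promise
  function(z) z / cofactor
}

# Hypothetical variant of proc_0 that still creates the local copy `dat`
proc_2 <- function(d) {
  dat <- sample(d)
  cofactors <- c(mean(dat), median(dat), IQR(dat))
  lapply(cofactors, make_scaler)
}

model <- proc_2(rnorm(1000))

e <- environment(model[[1]])
ls(e)

# the parent is where make_scaler was defined, not the call frame of proc_2
identical(parent.env(e), environment(make_scaler))

# the returned functions still work as before
model[[1]](1)
```

Calling `rm(dat)` just before returning from the processing function would
be another option, at the price of having to remember it for every large
local variable.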


