[Rd] mclapply memory leak?

Toby Hocking tdhock5 at gmail.com
Thu Sep 3 15:26:33 CEST 2015


Right, it is not a memory leak; sorry for the misleading subject line.

The problem is that memory usage grows linearly with the length of the
first argument to mclapply. In practice, with large data sets, this can
cause the machine to start swapping, or get my cluster jobs killed for
using too much memory.
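
A possible mitigation, as an untested sketch: since the overhead grows with
the length of the first argument, one could hand mclapply a short list of
chunks and run plain lapply inside each worker. The helper name
chunked.mclapply and its n.chunks argument are hypothetical (they are not
part of the parallel package), and whether this actually removes the
overhead would need to be checked with a memory monitor such as htop.

```r
library(parallel)

## Hypothetical helper, not part of the parallel package: split X into
## at most n.chunks contiguous pieces, so that mclapply only sees a
## short first argument, then run ordinary lapply within each piece.
chunked.mclapply <- function(X, FUN, n.chunks = 4L, ...) {
  if (length(X) == 0L) return(list())
  n.chunks <- min(n.chunks, length(X))
  ## Chunk index in 1..n.chunks for each element, preserving order.
  chunk.id <- ceiling(seq_along(X) * n.chunks / length(X))
  chunk.list <- split(X, chunk.id)
  result.chunks <- mclapply(
    chunk.list, function(chunk) lapply(chunk, FUN), ...)
  ## Concatenate the per-chunk result lists back into one flat list.
  do.call(c, unname(result.chunks))
}

## Same shape as the example in this thread, with a smaller N so that
## it runs quickly.
seconds <- 1
N <- 1000
result.list <- chunked.mclapply(1:N, function(i) Sys.sleep(1/N*seconds))
```

The result should have the same length and order as with lapply(1:N, ...);
n.chunks trades off load balancing against the length of the list passed
to mclapply.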

On Wed, Sep 2, 2015 at 2:35 PM, Gabriel Becker <gmbecker at ucdavis.edu> wrote:

> Well it's only a leak if you don't get the memory back after it returns,
> right?
>
> Anyway, one (untested by me) possibility is the copying of memory pages
> when the garbage collector touches objects, as pointed out by Radford Neal
> here:
> http://r.789695.n4.nabble.com/Re-R-devel-Digest-Vol-149-Issue-22-td4710367.html
>
> If so, I don't think this would be easily avoidable, but there may be
> mitigation strategies.
>
> ~G
>
> On Wed, Sep 2, 2015 at 10:12 AM, Toby Hocking <tdhock5 at gmail.com> wrote:
>
>> Dear R-devel,
>>
>> I am running mclapply with many iterations over a function that modifies
>> nothing and makes no copies of anything. It is taking up a lot of memory,
>> so it seems to me like this is a bug. Should I post this to
>> bugs.r-project.org?
>>
>> A minimal reproducible example can be obtained by first starting a memory
>> monitoring program such as htop, and then executing the following code
>> while looking at how much memory is being used by the system:
>>
>> library(parallel)
>> seconds <- 5
>> N <- 100000
>> result.list <- mclapply(1:N, function(i)Sys.sleep(1/N*seconds))
>>
>> On my system, memory usage goes up about 60MB on this example. But it does
>> not go up at all if I change mclapply to lapply. Is this a bug?
>>
>> For a more detailed discussion with a figure that shows that the memory
>> overhead is linear in N, please see
>> https://github.com/tdhock/mclapply-memory
>>
>> > sessionInfo()
>> R version 3.2.2 (2015-08-14)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> Running under: Ubuntu precise (12.04.5 LTS)
>>
>> locale:
>>  [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_CA.UTF-8
>>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_CA.UTF-8
>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] parallel  graphics  utils     datasets  stats     grDevices methods
>> [8] base
>>
>> other attached packages:
>> [1] ggplot2_1.0.1      RColorBrewer_1.0-5 lattice_0.20-33
>>
>> loaded via a namespace (and not attached):
>>  [1] Rcpp_0.11.6             digest_0.6.4            MASS_7.3-43
>>  [4] grid_3.2.2              plyr_1.8.1              gtable_0.1.2
>>  [7] scales_0.2.3            reshape2_1.2.2          proto_1.0.0
>> [10] labeling_0.2            tools_3.2.2             stringr_0.6.2
>> [13] dichromat_2.0-0         munsell_0.4.2           PeakSegJoint_2015.08.06
>> [16] compiler_3.2.2          colorspace_1.2-4
>>
>
>
>
> --
> Gabriel Becker, PhD
> Computational Biologist
> Bioinformatics and Computational Biology
> Genentech, Inc.
>
