[Rd] R-devel Digest, Vol 149, Issue 22

Radford Neal radford at cs.toronto.edu
Sun Jul 26 20:14:02 CEST 2015


> From: Joshua Bradley <jgbradley1 at gmail.com>
> 
> I have been having issues using parallel::mclapply in a memory-efficient
> way and would like some guidance. I am using a 40-core machine with 96 GB
> of RAM. I've tried to run mclapply with 20, 30, and 40 mc.cores, and each
> time it has practically brought the machine to a standstill, to the point
> where I have to do a hard reset.

When mclapply forks to start a new worker process, memory is initially
shared with the parent process, and a page is copied only when either
process writes to it (copy-on-write).  Unfortunately, R's garbage
collector writes to every object to mark and unmark it whenever a full
garbage collection is done, so it's quite possible that every R object
will end up duplicated in each worker, even though many of them are
never actually changed (from the point of view of the R program).
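
You can watch this happen on Linux by checking how much memory a forked
worker has privately dirtied before and after a full collection.  The
sketch below is illustrative only: it assumes a Linux /proc filesystem,
and the private_dirty_mb helper is a hypothetical convenience function,
not part of the parallel package.

    library(parallel)

    ## A heap full of small objects: each list element is a separate R
    ## object whose header the collector writes to when marking.
    big <- as.list(rnorm(1e6))

    ## Hypothetical Linux-only helper: megabytes of pages this process
    ## has had to privately copy, summed from /proc/self/smaps.
    private_dirty_mb <- function() {
      lines <- grep("^Private_Dirty:",
                    readLines("/proc/self/smaps"), value = TRUE)
      sum(as.numeric(gsub("[^0-9]", "", lines))) / 1024
    }

    ## Right after the fork almost everything is still shared; a single
    ## full collection then touches every object and forces the kernel
    ## to copy the pages holding them into the child.
    mclapply(1:2, function(i) {
      before <- private_dirty_mb()
      gc()  # full collection: writes mark bits into every object
      after <- private_dirty_mb()
      c(before_mb = before, after_mb = after)
    }, mc.cores = 2)

A corollary is that the duplication cost scales with the number of
objects, not just with the total amount of data: one large atomic
vector is a single object, so marking it dirties far fewer pages than a
list holding millions of small objects.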

One thing on my near-term to-do list for pqR is to re-implement R's
garbage collector in a way that will avoid this (as well as having
various other advantages, including less memory overhead per object).

   Radford Neal


