[Rd] parallel: Memory improvement to PSOCK clusters (PATCH)

Henrik Bengtsson henrik.bengtsson at gmail.com
Wed Oct 5 20:28:11 CEST 2016


I would like to draw attention to a very simple patch to
parallel:::slaveLoop(), which I've already submitted as
https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17115.  The patch
lowers the memory overhead on the workers for anyone using
parallel::makeCluster().

The patch makes sure that the workers remove their results / values as
soon as they've been transferred back to the master process.  They
also remove any incoming objects / values as soon as possible.  For
instance, if a PSOCK worker produces a 1 GiB object in each iteration,
it currently holds on to the old result while working on the new
one, resulting in an unnecessary 1 GiB of memory overhead.  This patch
avoids that.
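
To illustrate the pattern outside of the parallel package, here is a
minimal sketch of a worker-style loop.  It is not the actual
parallel:::slaveLoop() code; worker_loop(), run_demo(), the 'recv' and
'send' callbacks and the toy task list are all hypothetical names made
up for this example.  The 'recv' callback records whether the previous
result is still alive in the loop's frame at the moment the worker
waits for its next task.

```r
## Simplified stand-in for a worker loop.  'recv' and 'send' are
## hypothetical callbacks, not part of the parallel package.
worker_loop <- function(recv, send, cleanup = TRUE) {
    repeat {
        msg <- recv()                      # wait for the next task
        if (identical(msg$type, "DONE")) break
        value <- do.call(msg$fun, msg$args)
        value <- list(type = "VALUE", value = value, tag = msg$tag)
        if (cleanup) rm(list = "msg")      # task data is no longer needed
        send(value)
        if (cleanup) rm(list = "value")    # result has been handed over
    }
    invisible(NULL)
}

## Toy task queue: two tasks that each allocate a numeric vector.
tasks <- list(
    list(type = "EXEC", fun = function(n) numeric(n),
         args = list(1e6), tag = 1L),
    list(type = "EXEC", fun = function(n) numeric(n),
         args = list(1e6), tag = 2L),
    list(type = "DONE")
)

run_demo <- function(cleanup) {
    i <- 0L
    held <- logical(0)
    recv <- function() {
        ## Frame of worker_loop(), i.e. where 'value' would live.
        env <- parent.frame()
        held <<- c(held, exists("value", envir = env, inherits = FALSE))
        i <<- i + 1L
        tasks[[i]]
    }
    send <- function(value) invisible(NULL)  # pretend to transfer
    worker_loop(recv, send, cleanup = cleanup)
    held
}

held_before <- run_demo(cleanup = FALSE)  # mimics the current behavior
held_after  <- run_demo(cleanup = TRUE)   # mimics the patched behavior
```

Without the cleanup, 'held_before' should be TRUE for every receive
after the first one, meaning the old result is still referenced while
the worker waits for and processes the next task.  With the two rm()
calls, 'held_after' should be FALSE throughout, so the garbage
collector is free to reclaim the old result.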

Index: src/library/parallel/R/worker.R
===================================================================
--- src/library/parallel/R/worker.R (revision 70874)
+++ src/library/parallel/R/worker.R (working copy)
@@ -44,7 +44,9 @@
                 t2 <- proc.time()
                 value <- list(type = "VALUE", value = value, success = success,
                               time = t2 - t1, tag = msg$data$tag)
+                rm(list = "msg")
                 sendData(master, value)
+                rm(list = "value")
             }
         }, interrupt = function(e) NULL)
 }

Thanks,

Henrik


