[R] foreach/dopar's processes accumulate RAM

Alexander Engelhardt alex at chaotic-neutral.de
Wed Oct 29 09:48:05 CET 2014


Hello all,

I have a triple nested loop in R like this:

all <- list()
for(a in A){
     all[[a]] <- list()
     for(b in B){
         all[[a]][[b]] <- foreach(c=C, .combine=rbind) %dopar% {
             ## I'm leaving out some preprocessing here
             this_GAM <- gam(formula, data=data,
                             family=nb(link="log", theta=THETA))
             predict(this_GAM, newdata=newdata)
         }
     }
}

The problem I have is that, with time, the individual R processes which 
the %dopar% spawns use up more and more RAM. When I start the triple 
loop, each process requires about 2GB of RAM, but after around eight 
hours, they use more than 4GB each. Here are the first two lines of 'top' output:

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
20880 engelhar  20   0 7042m 4.0g 2436 R 59.2  6.4  14:30.15 R
20878 engelhar  20   0 7042m 4.3g 2436 D 53.5  6.8  14:07.18 R

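The same growth could also be logged from inside R instead of read off 'top'. The following is only a minimal sketch under assumptions: it assumes the doParallel backend (the post does not say which backend is registered), and mem_log, the task count and the rnorm() call are hypothetical stand-ins for the real loop body. Each task reports its worker's PID and the memory R itself accounts for after a garbage collection.

```r
library(foreach)
library(doParallel)

# Assumption: a 2-worker PSOCK cluster stands in for the real setup.
cl <- parallel::makeCluster(2)
doParallel::registerDoParallel(cl)

mem_log <- foreach(i = 1:6, .combine = rbind) %dopar% {
    x <- rnorm(1e5)  # toy stand-in for the preprocessing and gam() fit
    # gc() forces a collection; column 2 of its result is "used" memory in Mb
    # for Ncells and Vcells, so the sum is the total R-level usage.
    data.frame(task = i, pid = Sys.getpid(), used_mb = sum(gc()[, 2]))
}

parallel::stopCluster(cl)
print(mem_log)
```

If used_mb stays flat per PID while RES in 'top' keeps growing, the extra memory is probably not held by live R objects. That reading is a guess, not something checked here.
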

I don't understand how this can happen. To my understanding, as soon as 
the foreach loop is done, i.e. as soon as a new 'b' is chosen from 'B' 
in the second loop, the individual parallel R processes should terminate 
and release the memory. There should not be an increase of memory 
consumption over time.
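
To make that expectation concrete, here is a minimal sketch in which the workers really are started and stopped once per 'b'. It rests on assumptions: it uses the doParallel backend with an explicit cluster (not necessarily the setup above), and res, the toy values of A, B and C and the loop body are hypothetical placeholders for the real code.

```r
library(foreach)
library(doParallel)

# Toy stand-ins for the real A, B and C.
A <- c("a1", "a2")
B <- c("b1", "b2")
C <- 1:4

res <- list()
for (a in A) {
    res[[a]] <- list()
    for (b in B) {
        # Start fresh worker processes for this (a, b) combination.
        cl <- parallel::makeCluster(2)
        doParallel::registerDoParallel(cl)

        res[[a]][[b]] <- foreach(c = C, .combine = rbind) %dopar% {
            # Placeholder for the preprocessing, gam() fit and predict().
            cbind(value = c^2, pid = Sys.getpid())
        }

        # Terminate the workers so the OS reclaims their memory.
        parallel::stopCluster(cl)
    }
}
```

With this structure each inner foreach runs on newly started processes, so any per-process growth can last at most one pass over 'C'. Whether the original setup keeps its workers alive between foreach calls depends on how the backend was registered.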

Does anyone know what is going on and how I can avoid this behavior?

Thanks in advance,
  Alex Engelhardt


