[R] Garbage collection problem

Peter Langfelder peter.langfelder at gmail.com
Fri Jan 4 01:01:47 CET 2013


Hello all,

I am running into a problem with garbage collection not being able to
free up all memory. Unfortunately I am unable to provide a minimal
self-contained example, although I can provide a self-contained
(but not minimal) example if anyone feels like wading through some 600 lines of code. I
would love to isolate the relevant parts from the code but whenever I
try to run a simpler example, the problem does not appear.

I run an algorithm that repeats the same calculation (on sampled, i.e.
different data) in each iteration. Each iteration uses relatively
large intermediate objects and calculations but returns a smaller
result; these results are then collated and returned from the main
function (call it myFnc). The problem is that memory used by the
intermediate calculations (it is difficult to say whether it's objects
or memory needed for apply calls) does not seem to be freed up even
after doing explicit garbage collection using gc() within the loop.

Thus, a call of something like

result = myFnc(arguments)

results in some memory that does not seem to be allocated to any visible
objects and yet is not freed by gc(). After executing an actual
call to the offending function, gc() tells me that Vcells use 538.6
Mb, but the sum of object.size() of all objects listed by ls(all.names
= TRUE) is only 183.3 Mb.
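
For reference, the comparison described above can be reproduced with
something like the following sketch. The helper names visibleMb and
vcellsMb are made up for illustration and are not part of the original
code. Note that object.size() does not include associated space such as
the environment of a function, so the two numbers need not agree.

```r
# Sum of object.size() over all objects in an environment, in Mb.
# object.size() does not count the environment attached to a function,
# so memory reachable only through an environment is not included.
visibleMb = function(env = globalenv()) {
  objNames = ls(envir = env, all.names = TRUE)
  sizes = vapply(objNames,
                 function(n) as.numeric(object.size(get(n, envir = env))),
                 numeric(1))
  sum(sizes) / 1024^2
}

# Vcells memory currently in use, in Mb, as reported by gc().
# Column 2 of the matrix returned by gc() is the "(Mb)" column for "used".
vcellsMb = function() {
  g = gc()
  unname(g["Vcells", 2])
}

# Compare the two figures after the call of interest
cat("Vcells used (Mb):      ", vcellsMb(), "\n")
cat("Visible objects (Mb):  ", visibleMb(), "\n")
```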


The thing is that if I remove 'result' using rm(result) and do gc()
again, the memory used decreases by a lot: gc() now reports 110.3 Mb
used in Vcells; this roughly corresponds to the sum of the sizes of
all objects returned by ls() (after removing 'result'), which is now
108.7 Mb. So used memory went down by something like 428 Mb but the
object.size of 'result' is only about 75 Mb.

Thus, it seems that the memory used by internal operations in myFnc
that should be freed up upon the completion of the function call
cannot be released by garbage collection until the result of the
function call is also removed.
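
One mechanism that can produce this pattern in R, not confirmed to be
the cause in this case, is a returned object that carries a reference to
the environment of the function call (for example a function or a
formula created inside it). The sketch below uses a hypothetical
stand-in, makeResult, rather than the actual myFnc; it only shows that
object.size() can report a small result while a large intermediate
object stays reachable through it.

```r
# Hypothetical stand-in for myFnc: returns a small function whose
# environment is the call frame, which still contains 'big'.
makeResult = function() {
  big = numeric(5e6)      # large intermediate object (5e6 doubles)
  small = mean(big)       # small quantity actually needed later
  function() small        # closure keeps the whole call frame reachable
}

result = makeResult()

# The reported size of 'result' excludes its environment ...
print(object.size(result), units = "Kb")

# ... yet the large intermediate is still reachable through it,
# so gc() cannot free it while 'result' exists.
print(object.size(environment(result)$big), units = "Mb")
```

In a situation like this, removing 'result' and calling gc() releases
the intermediate object as well, which would match the drop in Vcells
being larger than object.size(result).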

Like I said, I tried to replicate this behaviour on simple examples
but could not.

My question is, is this behaviour to be expected in complicated code,
or is it a bug that should be reported? Is there any way around it?

Thanks in advance for any insights or pointers.

Peter



