[R-sig-hpc] Seeing some memory leak with foreach...

Jonathan Greenberg jgrn at illinois.edu
Wed Feb 27 20:25:24 CET 2013


Thanks all -- I'm going to try out these suggestions (running a for
loop, checking the function a bit more closely, trying some different
backends) and get back to you!

--j

On Wed, Feb 27, 2013 at 12:36 PM, Peter Langfelder
<peter.langfelder at gmail.com> wrote:
> On Tue, Feb 26, 2013 at 6:49 AM, Jonathan Greenberg <jgrn at illinois.edu> wrote:
>> r-sig-hpc'ers:
>>
>> I always hate doing this, but the test function/dataset is going to be
>> hard to pass along to the list.  Basically: I have a foreach call that
>> has no superassignments or strange environmental manipulations, but
>> resulted in the nodes showing a slow but steady memory creep over
>> time.  I was using a parallel backend for foreach via doParallel.  Has
>> anyone else seen this behavior (unexplained memory creep)?  Is there a
>> good way to "flush" a node?  I'm trying to embed gc() at the top of my
>> foreach function, but the process took about 24 hours to reach the
>> memory overuse stage (several iterations would have passed by then,
>> i.e. the function would have been called more than once on a single
>> node), so I'm not sure whether this will work and figured I'd ask the
>> group.  I've seen other people post about this on various boards with
>> no clear response or solution (gc() apparently didn't work).
>>
>> Some other notes: there should be no resultant output of data, because
>> the output is being written from within the foreach function (i.e. the
>> output of the function that foreach executes is NULL).
>>
>> I'll see if I can work up a faster executing example later, but wanted
>> to see if there are some general pointers for dealing with memory
>> leaks using a parallel system.
>
> Hi Jonathan,
>
> Have you tried replacing the foreach(...) with a simple for loop to
> verify that the problem really is in the parallel execution, and not
> simply in the R code?
>
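
A minimal sketch of that check, using base R only. The function
process_chunk() is a hypothetical stand-in for the real per-iteration
work (it is not from the original code), and the iteration count and
vector size are arbitrary. The idea is to run the same body in a plain
for loop and record the memory in use after each iteration; if the
numbers climb steadily here too, the growth comes from the R code
rather than from the parallel backend.

```r
# Hypothetical stand-in for the real per-iteration work. As in the
# original post, it would write its own output and return NULL.
process_chunk <- function(i) {
  x <- rnorm(1e5)      # placeholder computation
  invisible(NULL)
}

n_iter <- 20           # arbitrary number of iterations for the sketch
mem_mb <- numeric(n_iter)

for (i in seq_len(n_iter)) {
  process_chunk(i)
  # gc() returns a matrix with rows Ncells and Vcells; its second
  # column is the memory currently in use, in Mb.
  mem_mb[i] <- sum(gc()[, 2])
}

# A flat series suggests no leak in the sequential code; a steady
# climb points at the function body itself.
print(round(mem_mb, 1))
```

The same body can also be kept inside foreach and run sequentially by
using %do% in place of %dopar%, which separates the effect of foreach
itself from the effect of the parallel backend.
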
> I second Simon's suggestion to pay careful attention to possible
> side-effects and objects not going out of scope when you think they
> should (for example, if something somewhere references the environment
> of a function that already completed, the environment and all objects
> within it are not out of scope).
>
> Peter
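
The scoping issue Peter describes can be sketched as follows. This is
an illustration only, not taken from the code under discussion: the
names make_writer, make_writer_lean, and big are hypothetical. A
function returned from another function keeps a reference to the
environment it was created in, so every local object of the completed
call stays reachable, and gc() cannot free it, for as long as the
returned function exists.

```r
# Hypothetical example: the returned closure never uses `big`, but it
# still holds a reference to the environment that contains it.
make_writer <- function() {
  big <- numeric(1e6)            # roughly 8 MB, local to this call
  function(msg) nchar(msg)
}

writer <- make_writer()

# `big` is still reachable through the closure's environment.
exists("big", envir = environment(writer), inherits = FALSE)   # TRUE

# One way to avoid this: drop large locals before creating or
# returning anything that captures the environment.
make_writer_lean <- function() {
  big <- numeric(1e6)
  rm(big)                        # release the large object explicitly
  function(msg) nchar(msg)
}

lean_writer <- make_writer_lean()
exists("big", envir = environment(lean_writer), inherits = FALSE)  # FALSE
```

Formulas created inside a function capture their environment in the
same way, so anything that stores a formula can keep that function's
locals alive as well.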



--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype: jgrn3007


