[R] Memory Fragmentation in R
Nawaaz Ahmed
nawaaz at inktomi.com
Sat Feb 19 18:18:36 CET 2005
I have a data set of roughly 700 MB which grows to about 2 GB during processing
(I'm using a 4 GB Linux box). After the work is done I clean up with rm() and
memory use returns to roughly 700 MB. Yet I cannot run the same routine again:
it claims it cannot allocate memory, even though the verbose gc output
(gcinfo()) says there is 1.1 GB of heap free.
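
Very roughly, the shape of the problem is the toy version below (the sizes are
illustrative, not my real code; the actual processing is much more involved):

## Toy version of the pattern (sizes illustrative only)
big <- numeric(90e6)        # ~700 MB of doubles, kept around the whole time
tmp <- numeric(170e6)       # working objects push usage towards 2 GB
## ... processing using big and tmp ...
rm(tmp)                     # clean up when done
gc()                        # reports the heap mostly free again
tmp <- numeric(170e6)       # an allocation like this fails on the second pass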
At the start of the second time
===============================
            used   (Mb)  gc trigger    (Mb)
Ncells   2261001   60.4     3493455    93.3
Vcells  98828592  754.1   279952797  2135.9
Before Failing
==============
Garbage collection 459 = 312+51+96 (level 0) ...
1222596 cons cells free (34%)
1101.7 Mbytes of heap free (51%)
Error: cannot allocate vector of size 559481 Kb
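
For scale (assuming the failing allocation is a plain numeric vector), that
request is a single contiguous block of about half a gigabyte:

559481 * 1024       # ~573 million bytes requested in one block
559481 * 1024 / 8   # ~71.6 million doubles that must be contiguous

so even with 1.1 GB nominally free, the allocator has to find one unbroken run
of that size.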
This looks like a fragmentation problem. Does anyone have a handle on this
situation (i.e., any workaround)? Is anyone working on improving R's
fragmentation behaviour?
On the other hand, is it possible there is a memory leak? To make my functions
work on this dataset I tried to eliminate copies by coding with references
(basic new.env() tricks, sketched below). I presume my clean-up returned the
temporary data, as the gc output at the start of the second round of processing
suggests. Is it possible that it was not really freed and is still sitting
around somewhere, even though gc() thinks it has been returned?
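
By the new.env() tricks I mean the usual pattern of wrapping data in an
environment so functions can update it in place rather than copy it; a sketch
along these lines, not my actual code:

## Passing data by reference via an environment (illustrative sketch)
make_ref <- function(value) {
  e <- new.env()
  e$value <- value
  e                          # the environment itself is passed by reference
}

set_value <- function(ref, new) {
  ref$value <- new           # updates the shared environment; ref is not copied
  invisible(ref)
}

r <- make_ref(1:10)
set_value(r, r$value * 2)
r$value                      # the caller sees the modified data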
Thanks - any clues to follow up on would be very helpful.
Nawaaz