[R] memory management

Milan Bouchet-Valat nalimilan at club.fr
Wed Feb 29 18:18:50 CET 2012


On Wednesday, 29 February 2012 at 11:42 -0500, Sam Steingold wrote:
> > * William Dunlap <jqhaync at gvopb.pbz> [2012-02-28 23:06:54 +0000]:
> >
> > You need to walk through the objects, checking for environments on
> > each component or attribute of an object.
> 
> so why doesn't object.size do that?
> 
> >   > f <- function(n) {
> >   +   d <- data.frame(y = rnorm(n), x = rnorm(n))
> >   +   lm(y ~ poly(x, 4), data=d)
> >   + }
> 
> I am not doing any modeling. No "~". No formulas.
> The whole thing is just a bunch of data frames.
> I do a lot of strsplit, unlist, & subsetting, so I could imagine why
> the RSS is triple the total size of my data if all the intermediate
> results are not released.
I think you're simply hitting a (terrible) limitation of how memory is
returned to the OS. On Linux, R is very often unable to give back the memory
it has freed because the heap is fragmented: the allocator can only hand
pages back to the OS when nothing live remains above them, and most of the
time some later allocation sits after the object you removed. I can't give
you a more precise explanation, but it's apparently a known problem and a
hard one to fix.
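
The usual trigger is lots of small allocations (exactly what strsplit() and
unlist() produce), since large vectors are typically mmap()ed by the
allocator and can be returned independently. A sketch of the pattern, just
for illustration (whether it reproduces depends on your allocator):

  # many small character vectors end up interleaved on the heap
  chunks <- lapply(1:2e5, function(i)
      strsplit(paste(sample(letters, 100, TRUE), collapse = ","), ",")[[1]])
  last <- strsplit("x,y,z", ",")[[1]]   # a later allocation that stays live
  rm(chunks)
  invisible(gc())   # R's own "used" figures drop, but the pages that held
                    # 'chunks' sit below 'last' in the heap and often cannot
                    # be handed back, so the RSS seen by the OS stays high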

At least, I can confirm that after doing a lot of merges on big data frames,
R can keep 3 GB of resident memory on my box even though gc() reports only
500 MB in use. Restarting R brings memory use back down to what I would
expect.
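
If you want to compare the two figures yourself, something like this works
from within R (a quick sketch, Linux-specific):

  gc()   # the "used" columns show what R itself still considers live
  # resident set size of this R process, as the OS sees it:
  system(sprintf("grep VmRSS /proc/%d/status", Sys.getpid()))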


Regards


