[Rd] gc(reset=TRUE) reset timing

Stavros Macrakis macrakis at alum.mit.edu
Wed Feb 11 17:53:50 CET 2009


The man page for gc reads:

     The final two columns show the maximum space used since the last
     call to 'gc(reset=TRUE)' (or since R started).

The word 'last' here is ambiguous: does it include the *current* call
to gc?  When I first read this, I assumed that it did not; indeed, I
only realized that there was an ambiguity after trying to use
reset=TRUE for some measurements and discovering that the
implementation corresponds to the reading where it *does* include the
current call to gc (see the transcript below).

The interpretation that 'last' does *not* include the current call
seems more useful than the implemented behavior, since with the
implemented behavior, to report maximum memory usage for each of a
series of operations, you need to make *two* calls to gc between
operations, one with reset=FALSE (to return the maximum space used)
and one with reset=TRUE (to perform the reset for the next
measurement).
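
For concreteness, here is a minimal sketch of that two-call pattern wrapped in a helper. The name peak_vcells is hypothetical (it is not part of R), and the sketch only assumes the gc() behavior described above: reset=TRUE discards the old maximum, and a later reset=FALSE call reports the maximum reached since then.

```r
# Hypothetical helper (not part of R): report the peak Vcells usage
# while evaluating one expression, using two gc() calls.
peak_vcells <- function(expr) {
  gc(reset = TRUE)    # first call: reset the "max used" statistics
  force(expr)         # evaluate the operation (forces the lazy argument)
  # second call: read the maximum without resetting it
  gc(reset = FALSE)["Vcells", "max used"]
}

# Measure a series of operations, one at a time.
big   <- peak_vcells(numeric(10^7))
small <- peak_vcells(numeric(10^6))
print(c(big = big, small = small))
```

Each measurement costs two full collections, which is the overhead that a "report, then reset" ordering inside a single gc(reset=TRUE) call would avoid.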

So I suggest that the reset happen only *after* the "max used"
calculation.  Alternatively, the documentation could be clarified.

           -s

> {gc(reset=T); numeric(10^7); gc(reset=FALSE)["Vcells","max used"]}
[1] 10353316                        <<< shows max used by numeric

> {gc(reset=T); numeric(10^7); gc(reset=TRUE)["Vcells","max used"]}
[1] 353329                          <<< does not show max used by numeric


