[Rd] Interpreting R memory profiling statistics from Rprof() and gc()
Tomas Kalibera
tomas.kalibera at gmail.com
Mon May 29 15:09:19 CEST 2017
On 05/18/2017 06:54 PM, Joy wrote:
> Sorry, this might be a really basic question, but I'm trying to interpret
> the results from memory profiling, and I have a few questions (marked by
> *Q#*).
>
> From the summaryRprof() documentation, it seems that the four columns of
> statistics that are reported when setting memory.profiling=TRUE are
> - vector memory in small blocks on the R heap
> - vector memory in large blocks (from malloc)
> - memory in nodes on the R heap
> - number of calls to the internal function duplicate in the time interval
> (*Q1:* Are the units of the first 3 stats in bytes?)
In Rprof.out, vector memory in small and large blocks is given in 8-byte
units (for historical reasons), but memory in nodes is given in bytes -
this is not guaranteed anywhere in the documentation. In
summaryRprof(memory="both"), memory usage is given in megabytes, as
documented.
For summaryRprof(memory="stats") and summaryRprof(memory="tseries") I
clarified this in r72743: memory usage is now in bytes, and that is
documented.
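For illustration, a minimal sketch (the file name "prof.out" and the
workload are just examples):

  Rprof("prof.out", memory.profiling = TRUE)
  x <- lapply(1:100, function(i) rnorm(1e4))  # some allocating work
  Rprof(NULL)                                 # stop profiling
  summaryRprof("prof.out", memory = "stats")  # in bytes as of r72743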
>
> and from the gc() documentation, the two rows represent
> - ‘"Ncells"’ (_cons cells_), usually 28 bytes each on 32-bit systems and 56
> bytes on 64-bit systems,
> - ‘"Vcells"’ (_vector cells_, 8 bytes each)
> (*Q2:* how are Ncells and Vcells related to small heap/large heap/memory in
> nodes?)
Ncells describe memory in nodes (Ncells is the number of nodes).
Vcells describe memory in "small heap" + "large heap". A Vcell today
does not have much meaning, it is shown for historical reasons, but the
interesting thing is that Vcells*8 gives the number of bytes in
"small heap"+"large heap" objects.
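So, as a sketch (node sizes here assume a 64-bit system; replace 56 by
28 on a 32-bit one):

  g <- gc()
  g["Ncells", "used"] * 56   # bytes in nodes
  g["Vcells", "used"] * 8    # bytes in "small heap" + "large heap" objects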
> And I guess the question that led to these other questions is - *Q3:* I'd
> like to plot out the total amount of memory used over time, and I don't
> think Rprofmem() gives me what I'd like to know because, as I'm
> understanding it, Rprofmem() records the amount of memory allocated with
> each call, but this doesn't tell me the total amount of memory R is using,
> or am I mistaken?
Rprof controls a sampling profiler which regularly asks the GC how much
memory is currently in use on the R heap. Beware, though: some of that
memory may no longer be reachable and simply has not been collected yet
(running the GC more frequently helps here - you can use gctorture for
that), and some of it may still be reachable but will never be used
again. You can get this data via summaryRprof(memory="tseries") and
plot it: sum columns 1+2 or 1+2+3, depending on what you want. In
r72743 or newer the values are in bytes; in older versions you need to
multiply columns 1 and 2 by 8 first.
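For example (a sketch; "prof.out" and the workload are placeholders):

  Rprof("prof.out", memory.profiling = TRUE)
  x <- lapply(1:200, function(i) rnorm(1e5))   # example workload
  Rprof(NULL)
  m <- summaryRprof("prof.out", memory = "tseries")
  heap <- rowSums(m[, 1:2])   # small + large blocks; use 1:3 to add nodes
  plot(as.numeric(rownames(m)), heap, type = "l",
       xlab = "time (seconds)", ylab = "bytes in use on the R heap")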
Alternatively, if you are happy to modify your own R code and do not
insist on querying the memory size very frequently, you can call
gc(verbose=TRUE) explicitly at the points of interest. For this you
won't need the profiler at all.
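E.g. (a sketch with a made-up workload):

  for (step in 1:10) {
    x <- rnorm(1e6)      # the work done in this step
    gc(verbose = TRUE)   # print current Ncells/Vcells usage after each step
  }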
If instead you were asking how much memory the whole R instance is
using (that is, including memory allocated by the R GC but not
presently used for R objects, and memory outside the R heap), the
easiest way would be to use the facilities of your OS.
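On Linux, for instance, one could read /proc from within R (this is
OS-specific; other systems have their own tools, such as ps or the Task
Manager):

  vm <- readLines("/proc/self/status")
  grep("^Vm(RSS|Size)", vm, value = TRUE)  # resident/virtual size of the R process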
Rprofmem is a different thing - it records individual allocation events
rather than the total amount of memory in use - so it won't help you here.
Best
Tomas
>
> Thanks in advance!
>
> Joy
>