Rprof {utils} | R Documentation |
Enable Profiling of R's Execution
Description
Enable or disable profiling of the execution of R expressions.
Usage
Rprof(filename = "Rprof.out", append = FALSE, interval = 0.02,
memory.profiling = FALSE, gc.profiling = FALSE,
line.profiling = FALSE, filter.callframes = FALSE,
numfiles = 100L, bufsize = 10000L,
event = c("default", "cpu", "elapsed"))
Arguments
filename |
The file to be used for recording the profiling results.
Set to |
append |
logical: should the file be over-written or appended to? |
interval |
real: distance (time interval) between samples in seconds. |
memory.profiling |
logical: write memory use information to the file? |
gc.profiling |
logical: record whether GC is running? |
line.profiling |
logical: write line locations to the file? |
filter.callframes |
logical: filter out intervening call frames of the call tree. See the filtering out call frames section. |
numfiles , bufsize |
integers: line profiling memory allocation |
event |
character: profiling event, character vector of length one,
|
Details
Enabling profiling automatically disables any existing profiling to another or the same file.
Profiling works by writing out the call stack every interval
seconds (units of the profiling event), to the file specified. Either the
summaryRprof
function or the wrapper script R CMD
Rprof
can be used to process the output file to produce a summary of the
usage; use R CMD Rprof --help
for usage information.
Exactly what is measured is subtle and depends on the profiling event.
With "elapsed"
(the default and only supported event on Windows): it
is time that the R process is running and executing an R command. It is
not however just CPU time, for if readline()
is waiting for input,
that counts as well. It is also known as ‘elapsed’ time.
With "cpu"
(the default on Unix and typically the preferred event
for identifying performance bottlenecks), it is CPU time of the R process,
so for example excludes time when R is waiting for input or for processes
run by system
to return. It may go slower than
"elapsed"
when the process is often waiting for I/O to finish, but it
may go faster with actively computing concurrent threads (say via OpenMP)
on a multi-core system.
Note that the (timing) interval cannot be too small. With "cpu"
,
the time spent in each profiling step is currently added to the interval.
With all profiling events, the computation in each profiling step causes
perturbation to the observed system and biases the results. What is
feasible is machine-dependent. On Linux, R requires the interval to be at
least 10ms, on all other platforms at least 1ms. Shorter intervals will
be rounded up with a warning.
The "default"
profiling event is "elapsed"
on Windows and
"cpu"
on Unix.
Support for "elapsed"
event on Unix is new and considered
experimental. To reduce the risk of missing a sample, R tries to use the
(real-time) FIFO scheduling policy with the maximum scheduling priority
for an internal thread which initiates collection of each sample. If
setting that priority fails, it tries to use the maximum scheduling
priority of the current scheduling policy, falling back to the current
scheduling parameters. On Linux, regular users are typically not allowed
to use the real-time scheduling priorities. This can be usually allowed
via PAM (e.g. ‘/etc/security/limits.conf’), see the OS documentation
for details. The priorities only matter when profiling a system under
high load.
Functions will only be recorded in the profile log if they put a
context on the call stack (see sys.calls
). Some
primitive functions do not do so: specifically those which are
of type "special"
(see the ‘R Internals’ manual
for more details).
Individual statements will be recorded in the profile log if
line.profiling
is TRUE
, and if the code being executed
was parsed with source references. See parse
for a
discussion of source references. By default the statement locations
are not shown in summaryRprof
, but see that help page
for options to enable the display.
Filtering Out Call Frames
Lazy evaluation makes the call stack more complex because intervening
call frames are created between the time arguments are applied to a
function, and the time they are effectively evaluated. When the call
stack is represented as a tree, these intervening frames appear as
sibling nodes. For instance, evaluating try(EXPR)
produces the
following call tree, at the time EXPR
gets evaluated:
1. +-base::try(EXPR) 2. | \-base::tryCatch(...) 3. | \-base:::tryCatchList(expr, classes, parentenv, handlers) 4. | \-base:::tryCatchOne(expr, names, parentenv, handlers[[1L]]) 5. | \-base:::doTryCatch(return(expr), name, parentenv, handler) 6. \-EXPR
Lines 2 to 5 are intervening call frames, the last of which finally
triggered evaluation of EXPR
. Setting filter.callframes
to TRUE
simplifies the profiler output by removing all sibling
nodes of intervening frames.
The same kind of call frame filtering is applied with eval()
frames. When you call eval()
, two frames are pushed on the
stack to ensure a continuity between frames. Say we have these
definitions:
calling <- function() evaluator(quote(called()), environment()) evaluator <- function(expr, env) eval(expr, env) called <- function() EXPR()
calling()
calls called()
in its own environment, via
eval()
. The latter is called indirectly through
evaluator()
. The net effect of this code is identical to just
calling called()
directly, without the intermediaries. However,
the full call stack looks like this:
1. calling() 2. \-evaluator(quote(called()), environment()) 3. \-base::eval(expr, env) 4. \-base::eval(expr, env) 5. \-called() 6. \-EXPR()
When call frame filtering is turned on, the true calling environment
of called()
is looked up, and the filtered call stack looks like
this:
1. calling() 5. \-called() 6. \-EXPR()
If the calling environment is not on the stack, the function called by
eval()
becomes a root node. Say we have:
calling <- function() evaluator(quote(called()), new.env())
With call frame filtering we then get the following filtered call stack:
5. called() 6. \-EXPR()
Note
- On Unix-alikes:
-
Profiling is not available on all platforms. By default, support for profiling is compiled in if possible – configure R with --disable-R-profiling to change this.
As R CPU profiling uses the same mechanisms as C profiling, the two cannot be used together, so do not use
Rprof(event = "cpu")
(the default) in an executable built for C-level profiling (such as using the GCC option -p or -pg). - On Windows:
-
filename
can be a UTF-8-encoded filepath that cannot be translated to the current locale.
The profiler interrupts R asynchronously, and it cannot allocate
memory to store results as it runs. This affects line profiling,
which needs to store an unknown number of file pathnames. The
numfiles
and bufsize
arguments control the size of
pre-allocated buffers to hold these results: the former counts the
maximum number of paths, the latter counts the numbers of bytes in
them. If the profiler runs out of space it will skip recording the
line information for new files, and issue a warning when
Rprof(NULL)
is called to finish profiling.
See Also
The chapter on “Tidying and profiling R code” in
‘Writing R Extensions’: RShowDoc("R-exts")
.
summaryRprof
to analyse the output file.
tracemem
, Rprofmem
for other ways to track
memory use.
Examples
## Not run: Rprof()
## some code to be profiled
Rprof(NULL)
## some code NOT to be profiled
Rprof(append = TRUE)
## some code to be profiled
Rprof(NULL)
## ...
## Now post-process the output as described in Details
## End(Not run)