[Rd] Rprof ignores top-level computation (PR#6883)

garvin at cs.rice.edu garvin at cs.rice.edu
Thu May 13 23:21:26 CEST 2004


Full_Name: John Garvin
Version: 1.9.0
OS: Linux
Submission from: (NULL) (128.42.129.78)


This may or may not technically be a bug, but it's certainly an annoyance.

Rprof only takes into account computation that occurs inside functions. If a
time-consuming operation occurs outside a function, it doesn't record the time
it takes. Consider this program 'array.r':

Rprof()
foo <- array(1:1000000, dim=c(100,50,200))
bar <- c(1:1000, dim=c(1000000,3))
for (i in 1:100) {
  foo[bar>42] <- foo[bar>42]
}
Rprof(NULL)

Here's a comparison of the running time measured through the OS and the time
measured by Rprof:

$ time R --slave < array.r

real    0m16.068s
user    0m14.060s
sys     0m1.550s
$ R CMD Rprof Rprof.out
 
Each sample represents 0.02 seconds.
Total run time: 0.12 seconds.
 
Total seconds: time spent in function and callees.
Self seconds: time spent in function alone.
 
   %       total       %       self
 total    seconds     self    seconds    name
 66.67      0.08     66.67      0.08     "as.vector"
 66.67      0.08      0.00      0.00     "array"
 33.33      0.04     33.33      0.04     ">"
 
   %       self        %       total
 self     seconds    total    seconds    name
 66.67      0.08     66.67      0.08     "as.vector"
 33.33      0.04     33.33      0.04     ">"

Rprof gives a much more sensible result when the same code is wrapped in a
function definition, like this:

Rprof()
f <- function() {
  foo <- array(1:1000000, dim=c(100,50,200))
  bar <- c(1:1000, dim=c(1000000,3))
  for (i in 1:100) {
    foo[bar>42] <- foo[bar>42]
  }
}
f()
Rprof(NULL)

$ time R --slave < array.r
 
real    0m16.436s
user    0m14.110s
sys     0m1.490s
$ R CMD Rprof Rprof.out
 
Each sample represents 0.02 seconds.
Total run time: 14.4799999999998 seconds.
 
Total seconds: time spent in function and callees.
Self seconds: time spent in function alone.
 
   %       total       %       self
 total    seconds     self    seconds    name
100.00     14.48     99.17     14.36     "f"
  0.55      0.08      0.55      0.08     "as.vector"
  0.55      0.08      0.00      0.00     "array"
  0.28      0.04      0.28      0.04     ">"
 
   %       self        %       total
 self     seconds    total    seconds    name
 99.17     14.36    100.00     14.48     "f"
  0.55      0.08      0.55      0.08     "as.vector"
  0.28      0.04      0.28      0.04     ">"

All that time disappears when the code is not in a function. Rprof may have been
written to work that way--I don't know--but in any case its results with certain
inputs are pretty misleading.



More information about the R-devel mailing list