[Rd] Rprof ignores top-level computation (PR#6883)
    garvin at cs.rice.edu 
    garvin at cs.rice.edu
       
    Thu May 13 23:21:26 CEST 2004
    
    
  
Full_Name: John Garvin
Version: 1.9.0
OS: Linux
Submission from: (NULL) (128.42.129.78)
This may or may not technically be a bug, but it's certainly an annoyance.
Rprof only takes into account computation that occurs inside functions. If a
time-consuming operation occurs outside a function, it doesn't record the time
it takes. Consider this program 'array.r':
Rprof()
foo <- array(1:1000000, dim=c(100,50,200))
bar <- c(1:1000, dim=c(1000000,3))
for (i in 1:100) {
  foo[bar>42] <- foo[bar>42]
}
Rprof(NULL)
Here's a comparison of the running time measured through the OS and the time
measured by Rprof:
$ time R --slave < array.r
real    0m16.068s
user    0m14.060s
sys     0m1.550s
$ R CMD Rprof Rprof.out
 
Each sample represents 0.02 seconds.
Total run time: 0.12 seconds.
 
Total seconds: time spent in function and callees.
Self seconds: time spent in function alone.
 
   %       total       %       self
 total    seconds     self    seconds    name
 66.67      0.08     66.67      0.08     "as.vector"
 66.67      0.08      0.00      0.00     "array"
 33.33      0.04     33.33      0.04     ">"
 
   %       self        %       total
 self     seconds    total    seconds    name
 66.67      0.08     66.67      0.08     "as.vector"
 33.33      0.04     33.33      0.04     ">"
Rprof gives a much more sensible result when the same code is wrapped in a
function definition, like this:
Rprof()
f <- function() {
  foo <- array(1:1000000, dim=c(100,50,200))
  bar <- c(1:1000, dim=c(1000000,3))
  for (i in 1:100) {
    foo[bar>42] <- foo[bar>42]
  }
}
f()
Rprof(NULL)
$ time R --slave < array.r
 
real    0m16.436s
user    0m14.110s
sys     0m1.490s
$ R CMD Rprof Rprof.out
 
Each sample represents 0.02 seconds.
Total run time: 14.4799999999998 seconds.
 
Total seconds: time spent in function and callees.
Self seconds: time spent in function alone.
 
   %       total       %       self
 total    seconds     self    seconds    name
100.00     14.48     99.17     14.36     "f"
  0.55      0.08      0.55      0.08     "as.vector"
  0.55      0.08      0.00      0.00     "array"
  0.28      0.04      0.28      0.04     ">"
 
   %       self        %       total
 self     seconds    total    seconds    name
 99.17     14.36    100.00     14.48     "f"
  0.55      0.08      0.55      0.08     "as.vector"
  0.28      0.04      0.28      0.04     ">"
All that time disappears when the code is not in a function. Rprof may have been
written to work that way--I don't know--but in any case its results with certain
inputs are pretty misleading.
    
    
More information about the R-devel
mailing list