[R] System.time
Wacek Kusnierczyk
Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Wed Feb 18 23:51:13 CET 2009
Stavros Macrakis wrote:
> On Thu, Feb 12, 2009 at 4:28 AM, Gavin Simpson <gavin.simpson at ucl.ac.uk> wrote:
>
>> When I'm testing the speed of things like this (that are in and of themselves
>> very quick) for situations where it may matter, I wrap the function call in a call
>> to replicate():
>>
>> system.time(replicate(1000, svd(Mean_svd_data)))
>>
>> to run it 1000 times, and that allows me to judge how quickly the
>> function executes.
>>
>
> I do the same, but with a small twist:
>
> system.time(replicate(1000, {svd(Mean_svd_data); 0} ))
>
> This allows the values of svd(...) to be garbage collected.
>
> If you don't do this and the output of the timed code is large, you
> may allocate large amounts of memory (which may influence your timing
> results) or run out of memory (which will also influence your timing
> results :-) ),
>
>
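a quick way to see the effect stavros describes is to compare the size of
what replicate() returns in each case. this is only a sketch: the matrix m
below is a made-up stand-in for Mean_svd_data, which is not shown in this
thread, and the names kept and dropped are just for illustration.

```r
# hypothetical stand-in for Mean_svd_data (not given in the thread)
m = matrix(rnorm(50 * 50), 50, 50)

# every svd() result is retained in the value returned by replicate()
kept = replicate(20, svd(m))

# each svd() result is discarded right away; only the trailing 0 is kept
dropped = replicate(20, { svd(m); 0 })

object.size(kept)
object.size(dropped)
```

the first object holds all twenty decompositions, while the second is just a
numeric vector of twenty zeros, so the intermediate results are free to be
collected.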
to contribute my few cents, here's a simple benchmarking routine,
inspired by the perl module Benchmark. it allows one to benchmark an
arbitrary number of expressions with an arbitrary number of
replications, and provides a summary matrix with selected timings.
the code below is also available from google code [1], if anyone is
interested in updates (should there be any) or contributions.

benchmark = function(
      ...,
      columns=c('test', 'replications', 'user.self', 'sys.self',
         'elapsed', 'user.child', 'sys.child'),
      replicate=100,
      environment=parent.frame()) {
   arguments = match.call()[-1]
   parameters = names(arguments)
   if (is.null(parameters))
      parameters = as.character(arguments)
   else {
      indices = ! parameters %in% c('columns', 'replicate', 'environment')
      arguments = arguments[indices]
      parameters = parameters[indices] }
   result = cbind(
      test=rep(ifelse(parameters=='', as.character(arguments),
         parameters), each=length(replicate)),
      as.data.frame(
         do.call(rbind,
            lapply(arguments,
               function(argument)
                  do.call(rbind,
                     lapply(replicate,
                        function(count)
                           c(replications=count,
                              system.time(replicate(count, {
                                 eval(argument, environment); NULL })))))))))
   result[, columns, drop=FALSE] }

it's rudimentary and not fool-proof, but might be helpful if used with
care. (the nested do.call-rbind-lapply sequence can surely be
simplified, but i could not resist the pattern. someone once wrote that
if you need more than three (five?) levels of indentation in your code,
there must be something wrong with it; presumably, he was a fortran
programmer.)
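
for what it's worth, here is one way the nesting might be flattened. it is a
sketch only, not the code from [1]: time.one and benchmark2 are hypothetical
names, and unlike benchmark() this variant assumes the expressions are passed
as a named list of quoted calls rather than captured from the arguments.

```r
# time a single quoted expression, evaluated 'count' times in 'environment'
time.one = function(expr, count, environment)
   c(replications=count,
     system.time(replicate(count, { eval(expr, environment); NULL })))

# hypothetical flatter variant; 'expressions' must be a named list of calls
benchmark2 = function(expressions, replications=100,
      environment=parent.frame()) {
   force(environment)
   # one row per (test, replication count) pair, counts varying fastest
   grid = expand.grid(replications=replications, test=names(expressions),
      stringsAsFactors=FALSE)
   timings = Map(
      function(test, count) time.one(expressions[[test]], count, environment),
      grid$test, grid$replications)
   data.frame(test=grid$test, do.call(rbind, unname(timings)),
      stringsAsFactors=FALSE) }

benchmark2(list(seq=quote(1:10^5), rnorm=quote(rnorm(10^4))),
   replications=c(10, 100))
```

the rows come out in the same order as with benchmark(): all replication
counts for the first test, then all counts for the next one.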
examples:

benchmark(1:10^7)
#     test replications user.self sys.self elapsed user.child sys.child
# 1 1:10^7          100     2.168        0   2.166          0         0

benchmark(allocation=1:10^8, replicate=10)
#         test replications user.self sys.self elapsed user.child sys.child
# 1 allocation           10      0.98    3.073    4.05          0         0

means.rep = function(n, m) replicate(n, mean(rnorm(m)))
means.pat = function(n, m) colMeans(array(rnorm(n*m), c(m, n)))
(result = benchmark(replicate=c(10, 100, 1000),
   rep=means.rep(100, 100),
   pat=means.pat(100, 100),
   columns=c('test', 'replications', 'elapsed')))
#   test replications elapsed
# 1  rep           10   0.037
# 2  rep          100   0.387
# 3  rep         1000   3.840
# 4  pat           10   0.017
# 5  pat          100   0.170
# 6  pat         1000   1.731

result$elapsed/result$replications
# [1] 0.003700 0.003870 0.003840 0.001700 0.001700 0.001731
with(result, t.test(elapsed/replications ~ test, paired=TRUE))
# silly, i know...
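
a less silly summary might be the mean time per call for each test. the
sketch below rebuilds result by hand from the timings printed above, so that
nothing has to be rerun; per.call is a name made up for this example.

```r
# timings copied from the output shown above, not measured again
result = data.frame(
   test=rep(c('rep', 'pat'), each=3),
   replications=rep(c(10, 100, 1000), 2),
   elapsed=c(0.037, 0.387, 3.840, 0.017, 0.170, 1.731))

# mean elapsed time per single call, grouped by test
per.call = with(result, tapply(elapsed/replications, test, mean))
per.call

# how many times slower the replicate()-based version is
per.call[['rep']] / per.call[['pat']]
```

with these particular numbers the replicate()-based version comes out a bit
more than twice as slow as the colMeans() one.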
manual on demand.
vQ
[1] http://code.google.com/p/rbenchmark/