[R] Timings of function execution in R [was Re: R in Industry]

Gabor Grothendieck ggrothendieck at gmail.com
Fri Feb 9 00:29:49 CET 2007


This may not agree with the original to the last decimal place, but it
is nearly twice as fast again:

> set.seed(1)
> n <- 1000000
> x <- rnorm(n)
> y <- rnorm(n)
> system.time({z <- x > y; z*x+(!z)*y})
   user  system elapsed
   0.64    0.08    0.72
> system.time({z <- x > y; z * (x-y) + y})
   user  system elapsed
   0.35    0.04    0.39
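
A rough sketch of how one might check the "not exactly the same to the
last decimal" caveat, comparing both forms against base R's pmax. The
names a, b, p, xi, yi, zi and b_inf are only illustrative, and the
smaller n is just to keep the check quick:

```r
set.seed(1)
n <- 1e5
x <- rnorm(n)
y <- rnorm(n)

z <- x > y
a <- z * x + (!z) * y   # original form
b <- z * (x - y) + y    # rewritten form: (x - y) + y can round differently
p <- pmax(x, y)         # base R elementwise maximum, used as the reference

all(a == p)             # the original form selects x or y unchanged
all.equal(b, p)         # the rewritten form agrees up to a small tolerance
max(abs(b - p))         # size of the largest rounding difference

# Edge case: the multiplication trick assumes 0 * value is 0, which does
# not hold for infinite values (0 * Inf is NaN in IEEE arithmetic).
xi <- 1
yi <- Inf
zi <- xi > yi
b_inf <- zi * (xi - yi) + yi   # NaN here, whereas pmax(xi, yi) is Inf
```

So the faster form seems fine for finite data, but I would not rely on
it where Inf values can occur.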

On 2/8/07, Douglas Bates <bates at stat.wisc.edu> wrote:
> On 2/8/07, Albrecht, Dr. Stefan (AZ Private Equity Partner)
> <stefan.albrecht at apep.com> wrote:
> > Dear all,
> >
> > Thanks a lot for your comments.
> >
> > I fully agree with you that writing efficient code is about optimisation. The most important rules I know would be:
> > - vectorization
> > - pre-definition of vectors, etc.
> > - use matrix instead of data.frame
> > - do not use named objects
> > - use pure matrix instead of involved S4 (perhaps also S3) objects (can have enormous effects)
> > - use function instead of expression
> > - use compiled code
> > - I guess indexing with numbers (better variables) is also much faster than with text (names) (see also above)
> > - I even made, for example, my own min, max, since they are slow, e.g.,
> >
> > greaterOf <- function(x, y){
> > # Returns the elementwise maximum of x and y (numeric);
> > # the length of x or y may be a multiple of the other's
> >   z <- x > y
> >   z*x + (!z)*y
> > }
>
> That's an interesting function.  I initially was tempted to respond
> that you have managed to reinvent a specialized form of the ifelse
> function but then I decided to do the timings just to check (always a
> good idea).  The enclosed timings show that your function is indeed
> faster than a call to ifelse.  A couple of comments:
>
> - I needed to make the number of components in the vectors x and y
> quite large before I could get reliable timings on the system I am
> using.
>
> - The recommended way of doing timings is with the system.time
> function, which makes an effort to minimize the effects of garbage
> collection on the timings.
>
> - Even when using system.time there is often a big difference in
> timing between the first execution of a function call that generates a
> large object and subsequent executions of the same function call.
>
> [additional parts of the original message not relevant to this
> discussion have been removed]
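
On the timing points above, a minimal sketch of repeating the same
timing several times so that a slow first run stands out. The names
reps and elapsed are only illustrative; gcFirst = TRUE is the documented
default of system.time and is spelled out here for clarity:

```r
set.seed(1)
n <- 1e6
x <- rnorm(n)
y <- rnorm(n)

greaterOf <- function(x, y) {
  z <- x > y
  z * x + (!z) * y
}

# Time the same call several times; gcFirst = TRUE runs a garbage
# collection immediately before each timing starts.
reps <- 5
elapsed <- vapply(
  seq_len(reps),
  function(i) system.time(greaterOf(x, y), gcFirst = TRUE)[["elapsed"]],
  numeric(1)
)

print(elapsed)          # the first run may differ from the later ones
print(median(elapsed))  # a steadier summary than any single run
```

The actual numbers will of course depend on the machine and the R
version.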
>


