[R] Timings of function execution in R [was Re: R in Industry]
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Feb 9 19:33:15 CET 2007
> x <- rnorm(10000)
> system.time(for(i in 1:1000) pmax(x, 0))
   user  system elapsed
   4.43    0.05    4.54
> pmax2 <- function(k,x) (x+k + abs(x-k))/2
> system.time(for(i in 1:1000) pmax2(x, 0))
   user  system elapsed
   0.64    0.03    0.67
> pm <- function(x) {z <- x<0; x[z] <- 0; x}
> system.time(for(i in 1:1000) pm(x))
   user  system elapsed
   0.59    0.00    0.59
> system.time(for(i in 1:1000) pmax.int(x, 0))
   user  system elapsed
   0.36    0.00    0.36
So at least on one system Thomas' solution (pm) is a little faster than
the arithmetic one (pmax2), but a C-level n-args solution handling NAs
(pmax.int) is quite a lot faster than either.
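
A minimal sketch of a sanity check for the comparison above, assuming plain numeric input. The definitions of pmax2 and pm are copied from the transcript; the seed, the short vector y and the use of all.equal are illustrative assumptions, not part of the original timings. It checks that the three alternatives agree with pmax(x, 0), and shows that only the pmax family offers na.rm.

```r
# Definitions copied from the transcript.
pmax2 <- function(k, x) (x + k + abs(x - k)) / 2
pm    <- function(x) { z <- x < 0; x[z] <- 0; x }

set.seed(1)                       # arbitrary seed, for reproducibility only
x <- rnorm(10000)

# Compare each alternative against the reference result.
ref <- pmax(x, 0)
agree <- c(
  pmax2    = isTRUE(all.equal(pmax2(x, 0), ref)),
  pm       = isTRUE(all.equal(pm(x), ref)),
  pmax.int = isTRUE(all.equal(pmax.int(x, 0), ref))
)
print(agree)

# NA handling: pmax() and pmax.int() accept na.rm; the R-level
# shortcuts simply propagate the NA.
y <- c(-1, NA, 2)                 # hypothetical small input
with_na_rm <- pmax.int(y, 0, na.rm = TRUE)
from_pm    <- pm(y)
print(with_na_rm)
print(from_pm)
```

Timings will of course vary by machine and R version, so the figures in the transcript should be read as relative rather than absolute.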
On Fri, 9 Feb 2007, Martin Maechler wrote:
>>>>>> "TL" == Thomas Lumley <tlumley at u.washington.edu>
>>>>>> on Fri, 9 Feb 2007 08:13:54 -0800 (PST) writes:
>
> TL> On 2/9/07, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> >>> The other reason why pmin/pmax are preferable to your functions is that
> >>> they are fully generic. It is not easy to write C code which takes into
> >>> account that <, [, [<- and is.na are all generic. That is not to say that
> >>> it is not worth having faster restricted alternatives, as indeed we do
> >>> with rep.int and seq.int.
> >>>
> >>> Anything that uses arithmetic is making strong assumptions about the
> >>> inputs. It ought to be possible to write a fast C version that worked for
> >>> atomic vectors (logical, integer, real and character), but is there
> >>> any evidence of profiled real problems where speed is an issue?
>
>
> TL> I had an example just last month of an MCMC calculation where
> TL> profiling showed that pmax(x,0) was taking about 30% of the total
> TL> time. I used
>
> TL> function(x) {z <- x<0; x[z] <- 0; x}
>
> TL> which was significantly faster. I didn't try the
> TL> arithmetic solution.
>
> I did - eons ago as mentioned in my message earlier in this
> thread. I can assure you that those (also mentioned)
>
> pmin2 <- function(k,x) (x+k - abs(x-k))/2
> pmax2 <- function(k,x) (x+k + abs(x-k))/2
>
> are faster still, particularly if you hardcode the special case of k=0!
> {that's how I came about these: pmax(x,0) is also denoted x_+, and
> x_+ := (x + |x|)/2
> x_- := (x - |x|)/2
> }
>
> TL> Also, I didn't check if a solution like this would still
> TL> be faster when both arguments are vectors (but there was
> TL> a recent mailing list thread where someone else did).
>
> indeed, and they are faster.
> Martin
>
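
The identity behind the quoted pmin2/pmax2, max(x, k) = (x + k + |x - k|)/2 and min(x, k) = (x + k - |x - k|)/2, can be checked with a short sketch. The helper name pos_part, the seed and the test inputs are assumptions made for illustration. The last two cases illustrate the remark above that anything using arithmetic makes strong assumptions about the inputs.

```r
# Definitions copied from the quoted message.
pmin2 <- function(k, x) (x + k - abs(x - k)) / 2
pmax2 <- function(k, x) (x + k + abs(x - k)) / 2

set.seed(42)                      # arbitrary seed, for reproducibility only
x <- rnorm(1000)
k <- rnorm(1000)                  # both arguments are vectors here

# Agreement with the generic versions, up to floating-point rounding.
ok_max <- isTRUE(all.equal(pmax2(k, x), pmax(k, x)))
ok_min <- isTRUE(all.equal(pmin2(k, x), pmin(k, x)))

# Hard-coded special case k = 0: x_+ = (x + |x|)/2.
pos_part <- function(x) (x + abs(x)) / 2   # hypothetical helper name
ok_pos <- isTRUE(all.equal(pos_part(x), pmax(x, 0)))

# Where the arithmetic version breaks down:
inf_case <- pmax2(Inf, Inf)       # Inf - Inf is NaN, so the result is NaN
chr_case <- tryCatch(pmax2("a", "b"), error = function(e) "error")
chr_ref  <- pmax("a", "b")        # the generic version accepts character input

print(c(ok_max = ok_max, ok_min = ok_min, ok_pos = ok_pos))
```

On ordinary finite doubles the two forms agree, which is why the shortcut is safe inside a numeric inner loop such as the MCMC example, but not as a general replacement for pmax and pmin.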
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595