[R] Timings of function execution in R [was Re: R in Industry]
Duncan Murdoch
murdoch at stats.uwo.ca
Fri Feb 9 19:52:25 CET 2007
On 2/9/2007 1:33 PM, Prof Brian Ripley wrote:
>> x <- rnorm(10000)
>> system.time(for(i in 1:1000) pmax(x, 0))
> user system elapsed
> 4.43 0.05 4.54
>> pmax2 <- function(k,x) (x+k + abs(x-k))/2
>> system.time(for(i in 1:1000) pmax2(x, 0))
> user system elapsed
> 0.64 0.03 0.67
>> pm <- function(x) {z <- x<0; x[z] <- 0; x}
>> system.time(for(i in 1:1000) pm(x))
> user system elapsed
> 0.59 0.00 0.59
>> system.time(for(i in 1:1000) pmax.int(x, 0))
> user system elapsed
> 0.36 0.00 0.36
>
> So at least on one system Thomas' solution is a little faster, but a
> C-level n-args solution handling NAs is quite a lot faster.
For this special case we can do a lot better using
pospart <- function(x) (x + abs(x))/2
The less specialized function
pmax2 <- function(x, y) {
    diff <- x - y
    y + (diff + abs(diff))/2
}
is faster on my system than pm, but not as fast as pospart:
> system.time(for(i in 1:1000) pm(x))
[1] 0.77 0.01 0.78 NA NA
> system.time(for(i in 1:1000) pospart(x))
[1] 0.27 0.02 0.28 NA NA
> system.time(for(i in 1:1000) pmax2(x,0))
[1] 0.47 0.00 0.47 NA NA
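The arithmetic shortcuts are easy to sanity-check against pmax() itself. A minimal sketch (redefining pospart and pmax2 from above so it runs standalone; note these assume plain numeric vectors and do not provide the generic dispatch that pmax() does):

```r
## Sanity check: for numeric input, the arithmetic identities agree with
## pmax(x, 0) exactly -- doubling and halving are exact in IEEE arithmetic.
pospart <- function(x) (x + abs(x))/2
pmax2 <- function(x, y) {
    diff <- x - y
    y + (diff + abs(diff))/2
}
x <- rnorm(10000)
stopifnot(all(pospart(x) == pmax(x, 0)),
          all(pmax2(x, 0) == pmax(x, 0)))
```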
Duncan Murdoch
>
> On Fri, 9 Feb 2007, Martin Maechler wrote:
>
>>>>>>> "TL" == Thomas Lumley <tlumley at u.washington.edu>
>>>>>>> on Fri, 9 Feb 2007 08:13:54 -0800 (PST) writes:
>>
>> TL> On 2/9/07, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
>> >>> The other reason why pmin/pmax are preferable to your functions is that
>> >>> they are fully generic. It is not easy to write C code which takes into
>> >>> account that <, [, [<- and is.na are all generic. That is not to say that
>> >>> it is not worth having faster restricted alternatives, as indeed we do
>> >>> with rep.int and seq.int.
>> >>>
>> >>> Anything that uses arithmetic is making strong assumptions about the
>> >>> inputs. It ought to be possible to write a fast C version that worked for
>> >>> atomic vectors (logical, integer, real and character), but is there
>> >>> any evidence of profiled real problems where speed is an issue?
>>
>>
>> TL> I had an example just last month of an MCMC calculation where profiling showed that pmax(x,0) was taking about 30% of the total time. I used
>>
>> TL> function(x) {z <- x<0; x[z] <- 0; x}
>>
>> TL> which was significantly faster. I didn't try the
>> TL> arithmetic solution.
>>
>> I did - eons ago as mentioned in my message earlier in this
>> thread. I can assure you that those (also mentioned)
>>
>> pmin2 <- function(k,x) (x+k - abs(x-k))/2
>> pmax2 <- function(k,x) (x+k + abs(x-k))/2
>>
>> are faster still, particularly if you hardcode the special case of k=0!
>> {that's how I arrived at these: pmax(x,0) is also denoted x_+, and
>> x_+ := (x + |x|)/2
>> x_- := (x - |x|)/2
>> }
>>
>> TL> Also, I didn't check if a solution like this would still
>> TL> be faster when both arguments are vectors (but there was
>> TL> a recent mailing list thread where someone else did).
>>
>> indeed, and they are faster.
>> Martin
>>
>
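For completeness, Martin's general pmin2/pmax2 above can be checked against pmin()/pmax() on two full vectors. A small sketch: unlike the hardcoded k=0 case, the general-k arithmetic can differ from pmin/pmax in the last bit due to floating-point rounding, so the comparison uses all.equal() rather than exact equality:

```r
## Check Martin's identities against pmin()/pmax() for vector k.
## The arithmetic route may differ in the last ulp, hence all.equal().
pmin2 <- function(k, x) (x + k - abs(x - k))/2
pmax2 <- function(k, x) (x + k + abs(x - k))/2
set.seed(1)
x <- rnorm(1000); k <- rnorm(1000)
stopifnot(isTRUE(all.equal(pmin2(k, x), pmin(k, x))),
          isTRUE(all.equal(pmax2(k, x), pmax(k, x))))
```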