[Rd] Suggestion: 20% speed up of which() with two-character mod
Henrik Bengtsson
hb at stat.berkeley.edu
Fri Jul 11 05:18:30 CEST 2008
Hi,
in the source code for base::which(), dropping the temporary 'll' and
indexing names(x) with 'wh' instead gives a ~20% speed-up for
*named logical vectors*.
CURRENT CODE:
which <- function(x, arr.ind = FALSE)
{
    if(!is.logical(x))
        stop("argument to 'which' is not logical")
    wh <- seq_along(x)[ll <- x & !is.na(x)]
    m <- length(wh)
    dl <- dim(x)
    if (is.null(dl) || !arr.ind) {
        names(wh) <- names(x)[ll]
    }
    ...
    wh;
}
SUGGESTED CODE: (Remove 'll' and use 'wh')
which2 <- function(x, arr.ind = FALSE)
{
    if(!is.logical(x))
        stop("argument to 'which' is not logical")
    wh <- seq_along(x)[x & !is.na(x)]
    m <- length(wh)
    dl <- dim(x)
    if (is.null(dl) || !arr.ind) {
        names(wh) <- names(x)[wh]
    }
    ...
    wh;
}
That's all.
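As a minimal sketch of why the two versions should agree: the logical mask 'll' and the integer positions 'wh' select the same elements of names(x). The toy vector below is made up for illustration and is not part of the benchmark.

```r
# Small named logical vector, including an NA (illustrative data only)
x <- c(a = TRUE, b = FALSE, c = NA, d = TRUE)

ll <- x & !is.na(x)       # logical mask, same length as x
wh <- seq_along(x)[ll]    # integer positions of the TRUE elements

# Subsetting the names by mask or by position gives the same result
stopifnot(identical(names(x)[ll], names(x)[wh]))

# Naming 'wh' the which2() way reproduces base::which()
names(wh) <- names(x)[wh]
stopifnot(identical(wh, which(x)))
```

The integer index has only as many elements as there are TRUE values, whereas the logical mask always has length(x) elements.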
BENCHMARKING:
# To measure both in same environment
which1 <- base::which;
environment(which1) <- globalenv(); # Needed?
N <- 1e6;
set.seed(0xbeef);
x <- sample(c(TRUE, FALSE), size=N, replace=TRUE);
names(x) <- seq_along(x);
B <- 10;
t1 <- system.time({ for (bb in 1:B) idxs1 <- which1(x); });
t2 <- system.time({ for (bb in 1:B) idxs2 <- which2(x); });
stopifnot(identical(idxs1, idxs2));
print(t1/t2);
# Fair benchmarking: repeat with the two functions timed in reverse order
t2 <- system.time({ for (bb in 1:B) idxs2 <- which2(x); });
t1 <- system.time({ for (bb in 1:B) idxs1 <- which1(x); });
print(t1/t2);
##     user   system  elapsed
## 1.283186 1.052632 1.250000
You get similar results if you put the for loop outside the system.time()
call (and sum up the timings).
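For reference, that variant could look roughly like the sketch below. The helper name time_each() is hypothetical, and base::which() stands in for the function under test; a smaller N is used only to keep the example quick.

```r
# Hypothetical helper: time each call separately, then sum the timings.
# Assumes B >= 1.
time_each <- function(f, x, B) {
    timings <- lapply(seq_len(B), function(bb) system.time(f(x)))
    Reduce(`+`, timings)
}

set.seed(0xbeef)
x <- sample(c(TRUE, FALSE), size = 1e5, replace = TRUE)
names(x) <- seq_along(x)

t_sum <- time_each(base::which, x, B = 10)
print(t_sum)
```

Calling time_each() once per function and taking the ratio of the results gives the same kind of comparison as t1/t2 above.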
Cheers
Henrik