[Rd] Suggestion: 20% speed up of which() with two-character mod
Charles C. Berry
cberry at tajo.ucsd.edu
Fri Jul 11 17:57:56 CEST 2008
On Thu, 10 Jul 2008, Henrik Bengtsson wrote:
> Hi,
>
> by replacing 'll' with 'wh' in the source code for base::which() one
> gets ~20% speed up for *named logical vectors*.
The amount of speedup depends on how sparse the TRUE values are.
When the proportion of TRUEs gets small, the speedup is more than twofold
on my MacBook. For high proportions of TRUE, the speedup is closer to the
20% you cite.
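
A rough way to check the dependence on sparsity is sketched below. It is
only an illustration: which_ll() and which_wh() are hypothetical cut-down
versions of the two variants quoted further down, keeping just the names()
step and leaving out the part that is elided there. N, the repeat count,
and the proportions tried are arbitrary choices.

```r
# Hypothetical cut-down versions of the two quoted variants
# (names() step only; the elided '...' part is not reproduced).
which_ll <- function(x) {
  wh <- seq_along(x)[ll <- x & !is.na(x)]
  names(wh) <- names(x)[ll]   # logical subscript, as long as x
  wh
}
which_wh <- function(x) {
  wh <- seq_along(x)[x & !is.na(x)]
  names(wh) <- names(x)[wh]   # integer subscript, one entry per TRUE
  wh
}

set.seed(1)
N <- 1e5   # arbitrary vector length
B <- 20    # arbitrary number of repeats
for (p in c(0.001, 0.01, 0.1, 0.5, 0.9)) {
  x <- runif(N) < p                 # proportion of TRUEs is roughly p
  names(x) <- seq_along(x)
  t_ll <- system.time(for (bb in 1:B) r1 <- which_ll(x))[["elapsed"]]
  t_wh <- system.time(for (bb in 1:B) r2 <- which_wh(x))[["elapsed"]]
  stopifnot(identical(r1, r2))      # both variants must agree
  cat(sprintf("p = %5.3f  elapsed: ll = %.3f  wh = %.3f\n", p, t_ll, t_wh))
}
```

Timings will differ by machine and R version, so no output is shown here;
the stopifnot() only confirms that the two variants return the same result.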
HTH,
Chuck
>
> CURRENT CODE:
>
> which <- function(x, arr.ind = FALSE)
> {
>     if(!is.logical(x))
>         stop("argument to 'which' is not logical")
>     wh <- seq_along(x)[ll <- x & !is.na(x)]
>     m <- length(wh)
>     dl <- dim(x)
>     if (is.null(dl) || !arr.ind) {
>         names(wh) <- names(x)[ll]
>     }
>     ...
>     wh;
> }
>
> SUGGESTED CODE: (Remove 'll' and use 'wh')
>
> which2 <- function(x, arr.ind = FALSE)
> {
>     if(!is.logical(x))
>         stop("argument to 'which' is not logical")
>     wh <- seq_along(x)[x & !is.na(x)]
>     m <- length(wh)
>     dl <- dim(x)
>     if (is.null(dl) || !arr.ind) {
>         names(wh) <- names(x)[wh]
>     }
>     ...
>     wh;
> }
>
> That's all.
>
> BENCHMARKING:
>
> # To measure both in same environment
> which1 <- base::which;
> environment(which1) <- globalenv(); # Needed?
>
> N <- 1e6;
> set.seed(0xbeef);
> x <- sample(c(TRUE, FALSE), size=N, replace=TRUE);
> names(x) <- seq_along(x);
> B <- 10;
> t1 <- system.time({ for (bb in 1:B) idxs1 <- which1(x); });
> t2 <- system.time({ for (bb in 1:B) idxs2 <- which2(x); });
> stopifnot(identical(idxs1, idxs2));
> print(t1/t2);
> # Fair benchmarking
> t2 <- system.time({ for (bb in 1:B) idxs2 <- which2(x); });
> t1 <- system.time({ for (bb in 1:B) idxs1 <- which1(x); });
> print(t1/t2);
> ##     user   system  elapsed
> ## 1.283186 1.052632 1.250000
>
> You get similar results if you put the for loop outside the system.time()
> call (and sum up the timings).
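
The variant mentioned above, timing each call separately and summing the
results, might look like the following sketch. sum_timings() is a
hypothetical helper, not part of the original benchmark. It is shown with
base::which() only because which2() is quoted above with its '...' elided;
the vector length is again an arbitrary choice.

```r
# Hypothetical helper: time B separate calls of f(x) and add up the timings.
sum_timings <- function(f, x, B = 10L) {
  total <- c(user = 0, system = 0, elapsed = 0)
  for (bb in seq_len(B)) {
    tt <- system.time(f(x))
    # first three entries of a proc_time object: user, system, elapsed
    total <- total + as.numeric(tt)[1:3]
  }
  total
}

set.seed(0xbeef)
x <- sample(c(TRUE, FALSE), size = 1e5, replace = TRUE)
names(x) <- seq_along(x)
print(sum_timings(which, x))
```

Any function of one logical vector can be passed as f, so the two variants
can be compared by dividing one result by the other.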
>
> Cheers
>
> Henrik
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901