[Rd] setequal: better readability, reduced memory footprint, and minor speedup

Peter Haverty haverty.peter at gene.com
Fri Jan 9 01:57:38 CET 2015


Try this out. It looks like a 2x speedup in some cases and a wash in
others.  "unique" does two allocations, but skipping the logical vector
that "> 0L" allocates could make up for it.
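To make the "> 0L" point concrete, here is a minimal sketch of the two membership tests side by side. The helper names via_compare and via_anyNA and the small vectors are made up for illustration; only match(), all() and anyNA() from base R are used (anyNA() needs R >= 3.1.0).

```r
xu <- c(3L, 1L, 2L)        # already-unique values
yu <- c(2L, 3L, 1L)        # same set, different order
zu <- c(1L, 5L, 2L)        # contains 5L, which is not in yu

# Variant 1: unmatched elements become 0L, and "> 0L" builds a logical
# vector of length(a) before all() reduces it to a single value.
via_compare <- function(a, b) all(match(a, b, 0L) > 0L)

# Variant 2: unmatched elements stay NA (the default nomatch), and
# anyNA() is applied to the match() result without the extra comparison.
via_anyNA <- function(a, b) !anyNA(match(a, b))

via_compare(xu, yu)
via_anyNA(xu, yu)
via_compare(zu, yu)
via_anyNA(zu, yu)
```

Both variants should agree on every input; the second just avoids materializing the intermediate logical vector.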

library(microbenchmark)
library(RUnit)

x = sample.int(1e4, 1e5, TRUE)
y = sample.int(1e4, 1e5, TRUE)

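set_equal <- function(x, y) {
    # De-duplicate both inputs, calling the internal directly.
    xu = .Internal(unique(x, FALSE, FALSE, NA))
    yu = .Internal(unique(y, FALSE, FALSE, NA))
    # Sets of different sizes cannot be equal.
    if (length(xu) != length(yu)) {
        return(FALSE)
    }
    # Same size, so the sets are equal iff every xu is found in yu.
    return( all(match(xu, yu, 0L) > 0L) )
}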

set_equal2 <- function(x, y) {
    xu = .Internal(unique(x, FALSE, FALSE, NA))
    yu = .Internal(unique(y, FALSE, FALSE, NA))
    if (length(xu) != length(yu)) {
        return(FALSE)
    }
    # Unmatched elements come back as NA, so no "> 0L" vector is needed.
    return( !anyNA(match(xu, yu)) )
}

microbenchmark(
    a = setequal(x, y),
    b = set_equal(x, y),
    c = set_equal2(x, y)
    )
checkIdentical(setequal(x, y), set_equal(x, y))
checkIdentical(setequal(x, y), set_equal2(x, y))

x = y
microbenchmark(
    a = setequal(x, y),
    b = set_equal(x, y),
    c = set_equal2(x, y)
    )
checkIdentical(setequal(x, y), set_equal(x, y))
checkIdentical(setequal(x, y), set_equal2(x, y))
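
The checks above only cover two large random integer vectors. As a rough sketch, the same logic can be compared against base setequal() on a few edge cases (duplicates, NA, empty vectors, character input). This version uses the exported unique() rather than the .Internal call and plain stopifnot() rather than RUnit, so it runs in a bare session; the name set_equal2_sketch and the choice of cases are illustrative assumptions, not part of the proposal.

```r
# Same logic as set_equal2, but via the public unique() (an assumption
# made here so the sketch does not depend on .Internal).
set_equal2_sketch <- function(x, y) {
    xu <- unique(x)
    yu <- unique(y)
    if (length(xu) != length(yu)) {
        return(FALSE)
    }
    !anyNA(match(xu, yu))
}

# Hypothetical edge cases, each a pair of inputs.
cases <- list(
    list(c(1, 2, 2), c(2, 1)),            # duplicates, same set
    list(c(1, 2), c(1, 3)),               # same size, different sets
    list(c(1, 2, 3), c(1, 2)),            # different sizes
    list(c(1, NA), c(NA, 1)),             # NA as a set element
    list(integer(0), integer(0)),         # both empty
    list(c("a", "b"), c("b", "a", "a"))   # character input
)

# TRUE for each case where the sketch agrees with base setequal().
agree <- vapply(cases, function(p) {
    identical(setequal(p[[1]], p[[2]]), set_equal2_sketch(p[[1]], p[[2]]))
}, logical(1))

stopifnot(all(agree))
```

The length check is what lets a one-directional match() suffice: once both sides are de-duplicated and have the same length, every element of xu being found in yu implies the reverse as well.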


Sorry, I'm probably over-posting today.

Regards,
