[Rd] setequal: better readability, reduced memory footprint, and minor speedup
Hervé Pagès
hpages at fredhutch.org
Tue Jan 6 22:02:31 CET 2015
Hi,
Current implementation:
setequal <- function (x, y)
{
    x <- as.vector(x)
    y <- as.vector(y)
    all(c(match(x, y, 0L) > 0L, match(y, x, 0L) > 0L))
}
First, what about replacing 'match(x, y, 0L) > 0L' and 'match(y, x, 0L) > 0L'
with 'x %in% y' and 'y %in% x', respectively? They're strictly
equivalent, but the latter form is a lot more readable than the former
(isn't this the "raison d'être" of %in%?):
setequal <- function (x, y)
{
    x <- as.vector(x)
    y <- as.vector(y)
    all(c(x %in% y, y %in% x))
}
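A quick way to check the equivalence claimed above, as a sketch only: in base R, %in% is defined as match(x, table, nomatch = 0L) > 0L, so the two spellings should agree element by element, including on NA. The sample vectors are made up for illustration.

```r
# Illustrative inputs (not from the original post)
x <- c(3L, 1L, 2L, NA, 7L)
y <- c(2L, 3L, 5L, NA)

# Old spelling: positions in y, with 0L for "no match", turned into a logical
old_form <- match(x, y, 0L) > 0L

# New spelling: same test, written with %in%
new_form <- x %in% y

# Both spellings produce the same logical vector
print(identical(old_form, new_form))
```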
Furthermore, replacing 'all(c(x %in% y, y %in% x))' with
'all(x %in% y) && all(y %in% x)' improves readability even more and,
more importantly, reduces memory footprint significantly on big vectors
(e.g. by 15% on integer vectors with 15M elements):
setequal <- function (x, y)
{
    x <- as.vector(x)
    y <- as.vector(y)
    all(x %in% y) && all(y %in% x)
}
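One plausible way to see where the saving comes from, as a rough sketch: c() has to allocate a third logical vector as long as both %in% results combined, whereas the && form reduces each result with all() on its own and skips the second lookup entirely when the first one is already FALSE. The names setequal_c and setequal_and are hypothetical, used here only to put the two variants side by side.

```r
# Hypothetical names for the two variants discussed above
setequal_c <- function(x, y) {
    x <- as.vector(x)
    y <- as.vector(y)
    all(c(x %in% y, y %in% x))
}
setequal_and <- function(x, y) {
    x <- as.vector(x)
    y <- as.vector(y)
    all(x %in% y) && all(y %in% x)
}

# Illustrative inputs: same set of values, different lengths and order
x <- 1:10
y <- c(10:1, 5L)
same_result <- identical(setequal_c(x, y), setequal_and(x, y))

# The vector built by c() is as long as x and y together
combined_length <- length(c(x %in% y, y %in% x))

# && short-circuits: the right-hand side is not evaluated when the left is FALSE
rhs_evaluated <- FALSE
res <- all(c(1L, 99L) %in% x) && {
    rhs_evaluated <- TRUE
    all(x %in% c(1L, 99L))
}
```

Here res is FALSE and rhs_evaluated stays FALSE, since 99L is not in x and the second lookup is never run.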
It also seems to speed things up a little (though not significantly).
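For anyone who wants to try the timing themselves, a minimal sketch using system.time() could look like the following. The vector size and the names old_impl and new_impl are arbitrary choices for illustration, and the numbers will vary by machine and R version, so none are shown.

```r
# Hypothetical names for the current and the proposed bodies
old_impl <- function(x, y) all(c(match(x, y, 0L) > 0L, match(y, x, 0L) > 0L))
new_impl <- function(x, y) all(x %in% y) && all(y %in% x)

# Two integer vectors holding the same values in different order
set.seed(1)
x <- sample.int(1e6)
y <- rev(x)

# Time each variant once; repeat or average for a more stable comparison
t_old <- system.time(r_old <- old_impl(x, y))
t_new <- system.time(r_new <- new_impl(x, y))

print(t_old)
print(t_new)
```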
Cheers,
H.
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319