[Rd] 1.6x speedup for requal() function (in R/src/main/unique.c)
Hervé Pagès
hpages at fhcrc.org
Fri Dec 2 02:40:34 CET 2011
Hi,
FWIW:
/* Taken from R/src/main/unique.c */
static int requal(SEXP x, int i, SEXP y, int j)
{
    if (i < 0 || j < 0) return 0;
    if (!ISNAN(REAL(x)[i]) && !ISNAN(REAL(y)[j]))
        return (REAL(x)[i] == REAL(y)[j]);
    else if (R_IsNA(REAL(x)[i]) && R_IsNA(REAL(y)[j])) return 1;
    else if (R_IsNaN(REAL(x)[i]) && R_IsNaN(REAL(y)[j])) return 1;
    else return 0;
}
/* Between 1.34x and 1.37x faster on my 64-bit Ubuntu laptop */
static int requal2(SEXP x, int i, SEXP y, int j)
{
    double xi, yj;
    if (i < 0 || j < 0) return 0;
    xi = REAL(x)[i];
    yj = REAL(y)[j];
    if (!ISNAN(xi) && !ISNAN(yj)) return xi == yj;
    if (R_IsNA(xi) && R_IsNA(yj)) return 1;
    if (R_IsNaN(xi) && R_IsNaN(yj)) return 1;
    return 0;
}
/* A further 1.18x speedup, so overall requal3() is about 1.6x
   faster than requal() for me. requal3() uses simpler logic than
   requal(), but it should be equivalent, based on the following facts:
     (a) If at least *one* of xi or yj is a number (i.e. not NA or NaN),
         then xi and yj can be compared with xi == yj. They don't
         need to *both* be numbers for this comparison to be valid:
         comparing a number with NA or NaN using == yields 0.
     (b) Otherwise (i.e. if neither of them is a number), each
         of them is either NA or NaN (only 2 possible kinds for
         each), so comparing them with R_IsNA(xi) == R_IsNA(yj)
         should do the trick. */
static int requal3(SEXP x, int i, SEXP y, int j)
{
    double xi, yj;
    if (i < 0 || j < 0) return 0;
    xi = REAL(x)[i];
    yj = REAL(y)[j];
    if (!ISNAN(xi) || !ISNAN(yj)) return xi == yj;
    return R_IsNA(xi) == R_IsNA(yj);
}
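The equivalence argued in (a) and (b) can be checked outside of R with a small standalone sketch. It is only a model, not the code in unique.c: make_na(), model_is_na(), model_is_nan(), requal_ref(), requal_fast(), count_mismatches() and check_equivalence() are hypothetical names, isnan() stands in for ISNAN(), and NA is assumed to be a NaN whose low 32 bits hold 1954.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assumption: NA is modelled as a NaN whose low 32 bits are 1954. */
static double make_na(void)
{
    uint64_t bits = UINT64_C(0x7FF00000000007A2);
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Stand-in for R_IsNA(): a NaN carrying the 1954 marker. */
static int model_is_na(double d)
{
    uint64_t bits;
    if (!isnan(d)) return 0;
    memcpy(&bits, &d, sizeof bits);
    return (uint32_t) bits == 1954u;
}

/* Stand-in for R_IsNaN(): a NaN that is not NA. */
static int model_is_nan(double d)
{
    return isnan(d) && !model_is_na(d);
}

/* Same branch structure as requal(), on plain doubles. */
static int requal_ref(double xi, double yj)
{
    if (!isnan(xi) && !isnan(yj)) return xi == yj;
    if (model_is_na(xi) && model_is_na(yj)) return 1;
    if (model_is_nan(xi) && model_is_nan(yj)) return 1;
    return 0;
}

/* Same branch structure as requal3(), on plain doubles. */
static int requal_fast(double xi, double yj)
{
    if (!isnan(xi) || !isnan(yj)) return xi == yj;
    return model_is_na(xi) == model_is_na(yj);
}

/* Compare both versions on every pair drawn from a few representative values. */
static int count_mismatches(void)
{
    double vals[8];
    int n = 8, a, b, bad = 0;

    vals[0] = 0.0;       vals[1] = -0.0;
    vals[2] = 1.0;       vals[3] = -1.5;
    vals[4] = INFINITY;  vals[5] = -INFINITY;
    vals[6] = NAN;       vals[7] = make_na();

    for (a = 0; a < n; a++)
        for (b = 0; b < n; b++)
            if (requal_ref(vals[a], vals[b]) != requal_fast(vals[a], vals[b]))
                bad++;
    return bad;
}

/* Aborts if the two versions disagree on any pair. */
static void check_equivalence(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_equivalence() from a small main() is enough to run the check. It covers only the listed values, so it illustrates the argument rather than proving it.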
The logic of the cequal() function (in the same file) could also be
cleaned up in a similar way, probably for an even greater speedup.
This will benefit duplicated(), anyDuplicated() and unique() on numeric
and complex vectors.
Cheers,
H.
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319