[Rd] anyNA() performance on vectors of POSIXct
Harvey Smith
h@rvey13131 @end|ng |rom gm@||@com
Wed May 1 09:20:55 CEST 2019
Inside anyNA(), the legacy any(is.na()) code path is used whenever x is an
OBJECT(). A vector of POSIXct is an OBJECT() even though TYPEOF(x) ==
REALSXP, so it never reaches the ITERATE_BY_REGION path, which is typically
about 5x faster in my testing.
Is the OBJECT() condition really necessary, or could it be moved after the
switch() for the individual TYPEOF(x) ITERATE_BY_REGION calls?
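
To make the question concrete, here is a minimal standalone sketch in C of the two orderings. It does not use R's API: dvec, any_na_slow, any_na_fast, any_na_current and any_na_proposed are hypothetical names made up for illustration, and the model deliberately ignores method dispatch on is.na(), which a real change would still have to respect for classes that define their own method.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an R double vector; this is not R's SEXP. */
typedef struct {
    bool is_object;      /* models OBJECT(x): a class attribute is set */
    const double *data;  /* models REAL(x) */
    size_t n;            /* models xlength(x) */
} dvec;

/* Models any(is.na(x)): build the full flag vector, then reduce it. */
static bool any_na_slow(const dvec *x)
{
    if (x->n == 0) return false;
    unsigned char *flags = malloc(x->n);
    if (flags == NULL) abort();
    for (size_t i = 0; i < x->n; i++)
        flags[i] = isnan(x->data[i]) ? 1 : 0;
    bool found = false;
    for (size_t i = 0; i < x->n; i++)
        if (flags[i]) { found = true; break; }
    free(flags);
    return found;
}

/* Models the direct scan: no temporary, stop at the first NaN. */
static bool any_na_fast(const dvec *x)
{
    for (size_t i = 0; i < x->n; i++)
        if (isnan(x->data[i])) return true;
    return false;
}

/* Current order: the OBJECT() test comes before the type dispatch. */
static bool any_na_current(const dvec *x)
{
    return x->is_object ? any_na_slow(x) : any_na_fast(x);
}

/* Proposed order: a double vector always takes the direct scan. */
static bool any_na_proposed(const dvec *x)
{
    return any_na_fast(x);
}
```

Both orderings return the same answer for plain doubles with a class attribute; the only difference in this model is whether the temporary flag vector gets allocated and fully filled before the reduction.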
# Script to demonstrate the performance difference between x being an OBJECT
# or not, using unclass()
x.posixct = Sys.time() + 1:1e6
microbenchmark::microbenchmark(
  any(is.na(x.posixct)),
  anyNA(x.posixct),
  anyNA(unclass(x.posixct)),
  unit = 'ms')
static Rboolean anyNA(SEXP call, SEXP op, SEXP args, SEXP env)
{
    SEXP x = CAR(args);
    SEXPTYPE xT = TYPEOF(x);
    Rboolean isList = (xT == VECSXP || xT == LISTSXP), recursive = FALSE;
    if (isList && length(args) > 1) recursive = asLogical(CADR(args));
    if (OBJECT(x) || (isList && !recursive)) {  // <-- the condition in question
        SEXP e0 = PROTECT(lang2(install("is.na"), x));
        SEXP e = PROTECT(lang2(install("any"), e0));
        SEXP res = PROTECT(eval(e, env));
        int ans = asLogical(res);
        UNPROTECT(3);
        return ans == 1; // so NA answer is false.
    }
    R_xlen_t i, n = xlength(x);
    switch (xT) {
    case REALSXP:
    {
        if(REAL_NO_NA(x))
            return FALSE;
        ITERATE_BY_REGION(x, xD, i, nbatch, double, REAL, {
            for (int k = 0; k < nbatch; k++)
                if (ISNAN(xD[k]))
                    return TRUE;
        });
        break;
    }
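
The batched scan in the REALSXP case can be modelled outside of R roughly as follows. This is only a sketch of the idea as I understand it (the loop body sees one contiguous batch at a time and returns at the first NaN); get_region, any_nan_by_region and the REGION_SIZE value are assumptions made up for illustration, not R's actual macro or constants.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define REGION_SIZE 512  /* assumed batch size for this sketch */

/* Hypothetical accessor: copies up to `len` doubles starting at `start`
   into `buf` and returns how many elements were copied. */
static size_t get_region(const double *src, size_t n,
                         size_t start, size_t len, double *buf)
{
    if (start >= n) return 0;
    size_t count = (n - start < len) ? n - start : len;
    memcpy(buf, src + start, count * sizeof(double));
    return count;
}

/* Scan the vector one batch at a time, stopping at the first NaN. */
static bool any_nan_by_region(const double *src, size_t n)
{
    double buf[REGION_SIZE];
    for (size_t start = 0; start < n; start += REGION_SIZE) {
        size_t nbatch = get_region(src, n, start, REGION_SIZE, buf);
        for (size_t k = 0; k < nbatch; k++)
            if (isnan(buf[k]))
                return true;
    }
    return false;
}
```

Unlike any(is.na(x)), this never builds a full-length logical vector, and it stops reading as soon as a missing value is found.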