[Rd] Speed improvement to PROTECT, UNPROTECT, etc.
Radford Neal
radford at cs.toronto.edu
Mon Aug 23 15:09:57 CEST 2010
As I mentioned in my previous message about speeding up evalList, I've
been looking at ways to speed up the R interpreter. One sees in the
code many, many calls of PROTECT, UNPROTECT, and related functions, so
that seems like an obvious target for optimization. Indeed, I've
found that one can speed up the interpreter by about 10% by just
changing these.
The functions are actually macros defined in Rinternals.h, but end up
just calling functions defined in memory.c (apparently as protect,
etc., but really as Rf_protect, etc.). So there is function call
overhead every time they are used.
To get rid of the function call overhead, without generating lots of
extra code, one can redefine the macros to handle the common case
inline, and call the function in memory.c for the uncommon cases (e.g.,
error on stack underflow). Here are my versions that do this:
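    /* Push inline when there is room; otherwise let Rf_protect handle it. */
    #define PROTECT(s) do { \
        SEXP tmp_prot_sexp = (s); \
        if (R_PPStackTop < R_PPStackSize) \
            R_PPStack[R_PPStackTop++] = tmp_prot_sexp; \
        else \
            Rf_protect(tmp_prot_sexp); \
      } while (0)

    /* Pop inline; on underflow let Rf_unprotect report the error. */
    #define UNPROTECT(n) (R_PPStackTop >= (n) ? \
        (void) (R_PPStackTop -= (n)) : Rf_unprotect(n))

    /* Protect x and record the stack slot it occupies in *i. */
    #define PROTECT_WITH_INDEX(x,i) do { \
        PROTECT(x); \
        *(i) = R_PPStackTop - 1; \
      } while (0)

    /* Overwrite a previously recorded slot. */
    #define REPROTECT(x,i) ( (void) (R_PPStack[i] = (x)) )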
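To show the pattern in isolation, here is a small self-contained toy
version that compiles without any R headers. It is only a sketch: the
names pp_stack, pp_top, PP_SIZE, slow_protect, slow_unprotect,
FAST_PROTECT, FAST_UNPROTECT and exercise are made up for illustration
and stand in for R_PPStack, R_PPStackTop, R_PPStackSize, Rf_protect and
Rf_unprotect. The slow paths here just count how often they are reached,
which is not what the real functions in memory.c do.

```c
/* Toy stand-ins for R_PPStack, R_PPStackSize and R_PPStackTop
   (hypothetical names, not R's real internals). */
#define PP_SIZE 4
static void *pp_stack[PP_SIZE];
static int pp_top = 0;
static int slow_calls = 0;   /* how often the out-of-line path was taken */

/* Stands in for Rf_protect: reached only when the stack is full.
   Here it merely records the event. */
static void slow_protect(void *p) { (void) p; slow_calls++; }

/* Stands in for Rf_unprotect: reached only on underflow.
   Here it merely records the event. */
static void slow_unprotect(int n) { (void) n; slow_calls++; }

/* Common case handled inline, uncommon case by a function call. */
#define FAST_PROTECT(s) do { \
    void *tmp_ = (s); \
    if (pp_top < PP_SIZE) \
        pp_stack[pp_top++] = tmp_; \
    else \
        slow_protect(tmp_); \
  } while (0)

#define FAST_UNPROTECT(n) (pp_top >= (n) ? \
    (void) (pp_top -= (n)) : slow_unprotect(n))

/* Reset the toy stack, push `pushes` pointers, pop `pops` of them,
   and return the resulting stack height. */
static int exercise(int pushes, int pops)
{
    static int dummy;
    int i;
    pp_top = 0;
    slow_calls = 0;
    for (i = 0; i < pushes; i++)
        FAST_PROTECT(&dummy);
    FAST_UNPROTECT(pops);
    return pp_top;
}
```

Note that in this form the argument of the unprotect macro is evaluated
more than once, so it should not have side effects.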
Unfortunately, one can't just change the definitions in Rinternals.h.
Some uses of PROTECT are in places where R_PPStack, etc. aren't
visible. Instead, one can redefine them at the end of Defn.h, where
these variables are declared. That alone also doesn't work, however,
because some references don't work at link time. So instead I
redefine them in Defn.h only if USE_FAST_PROTECT_MACROS is defined.
I define this before including Defn.h in all the .c files in the main
directory.
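The opt-in arrangement can be sketched in a single file as follows. This
is an illustration of the mechanism only, not the actual contents of
Rinternals.h or Defn.h: DEMO_PROTECT, demo_protect, demo_top,
demo_fn_calls, DEMO_SIZE and demo_run are hypothetical names, and the
three commented sections stand for what would really be three separate
files.

```c
/* --- Stand-in for Rinternals.h: the macro always calls the function. --- */
#define DEMO_SIZE 8
static int demo_top = 0;        /* hypothetical stand-in for R_PPStackTop */
static int demo_fn_calls = 0;   /* counts out-of-line calls */
static void demo_protect(void) { demo_fn_calls++; }
#define DEMO_PROTECT() demo_protect()

/* --- Stand-in for a .c file in the main directory: opt in before
       the internal header is included. --- */
#define USE_FAST_PROTECT_MACROS

/* --- Stand-in for the end of Defn.h: redefine only when asked to,
       so code that does not opt in keeps the function-call version. --- */
#ifdef USE_FAST_PROTECT_MACROS
#undef DEMO_PROTECT
#define DEMO_PROTECT() (demo_top < DEMO_SIZE ? \
    (void) demo_top++ : demo_protect())
#endif

/* Uses whichever definition is in effect; returns the stack height. */
static int demo_run(void)
{
    demo_top = 0;
    demo_fn_calls = 0;
    DEMO_PROTECT();
    DEMO_PROTECT();
    return demo_top;
}
```

With the flag defined, both pushes take the inline path and the function
is never called; removing the #define of USE_FAST_PROTECT_MACROS leaves
the original definition in place.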
Another complication is that the PROTECT macro no longer returns its
argument. One can avoid this by writing it another way, but this then
results in its argument being referenced in two places (though only
evaluated once), which seems to slow things down, presumably because
the larger amount of code generated affects cache behaviour. Instead,
I just changed the relatively few occurrences of v = PROTECT(...) to
PROTECT(v = ...), which is the dominant idiom in the code anyway.
(In some places slightly more change is needed, as when this is in an
initializer.)
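For concreteness, here is a toy sketch of the two forms and of the
call-site change. The expression form shown is my guess at what "writing
it another way" could look like, not necessarily the version that was
timed. All names (vstack, vtop, VSIZE, vslow, PROTECT_EXPR, PROTECT_STMT,
make_obj, demo_idioms) are hypothetical, and vslow only mimics the fact
that Rf_protect returns its argument.

```c
#define VSIZE 4
static void *vstack[VSIZE];
static int vtop = 0;

/* Stand-in for Rf_protect, which returns its argument. */
static void *vslow(void *p) { return p; }

/* Expression form: yields its argument, but `s` appears in both
   branches (only one of which is evaluated). */
#define PROTECT_EXPR(s) (vtop < VSIZE ? \
    (vstack[vtop++] = (s)) : vslow(s))

/* Statement form: `s` appears once, but the macro has no value. */
#define PROTECT_STMT(s) do { \
    void *t_ = (s); \
    if (vtop < VSIZE) \
        vstack[vtop++] = t_; \
    else \
        vslow(t_); \
  } while (0)

static int made = 0;   /* counts how many times make_obj was evaluated */
static void *make_obj(void) { made++; return &made; }

/* Returns 1 if both idioms protect the object, leave it in the
   variable, and evaluate the argument exactly once each. */
static int demo_idioms(void)
{
    void *v, *w;
    vtop = 0;
    made = 0;
    v = PROTECT_EXPR(make_obj());   /* old idiom: v = PROTECT(...)  */
    PROTECT_STMT(w = make_obj());   /* new idiom: PROTECT(v = ...)  */
    return v == w && vtop == 2 && made == 2;
}
```

An initializer such as a declaration with "= PROTECT(...)" has no direct
statement-form equivalent, which is why those places need the declaration
and the PROTECT split into two lines.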
This works fine, giving a speedup of about 10% for R programs that
aren't dominated by large operations like multiplying big matrices.
The effect is cumulative with the change I mentioned in my previous
message about avoiding extra CONS operations in evalList, for a total
speedup of about 15%.
Radford Neal