[Rd] Custom C finalizers for .Call

Jeroen Ooms jeroen.ooms at stat.ucla.edu
Tue Nov 24 00:10:12 CET 2015

WRE explains that R_alloc() can be used to allocate memory which
automatically gets released by R at the end of a .C, .Call or
.External, even in the case of an error or interruption. This is a
really great feature to prevent memory leaks. I was wondering if there
is a way to extend this mechanism to allow for automatically running
UNPROTECT and custom finalizers at the end of a .Call as well.

Currently it is all too easy for package authors to introduce a memory
leak or stack imbalance by calling Rf_error() or
R_CheckUserInterrupt() in a way that skips over the usual cleanup
steps. This holds especially for packages interfacing with C libraries
(libcurl, libxml2, openssl, etc.) which require xx_new() and xx_free()
functions to allocate/free various types of objects, handles and
contexts. R_alloc() cannot be used for these objects, so we need to
clean up manually when returning, which is tricky for irregular exits.

Moreover, package authors might benefit from an alternative to
allocVector() which automatically protects the SEXP until the .Call is
done. Perhaps I don't fully appreciate the complexity of the garbage
collector, but one could imagine a variant of PROTECT() which
automatically keeps a counter 'n' for the number of allocated objects
and makes R run UNPROTECT(n) when .Call exits, along with releasing
the R_alloc() memory. Yes, there are cases where it is useful to have
manual control over what can be collected earlier during the .Call
procedure, but these are rare. A lot of C code in packages might
become safer and cleaner if authors would have an option to let this
be automated.
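
To make the counter idea concrete, here is a toy model in plain C. It
does not use R's API or reflect R's internals: protect_stack,
protect_scope, scope_begin(), scope_protect() and scope_end() are all
hypothetical names, and the array merely stands in for the pointer
protection stack. The point is only that a scope can count what it
protected and pop exactly that many entries on exit.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 64

/* Toy stand-in for a pointer protection stack (not R's real one). */
typedef struct {
    void *slots[STACK_MAX];
    size_t top;               /* number of entries currently protected */
} protect_stack;

/* A scope remembers how many objects it has protected so far. */
typedef struct {
    protect_stack *stack;
    size_t n;                 /* the counter 'n' from the text */
} protect_scope;

/* Open a scope on an existing stack; nothing is protected yet. */
protect_scope scope_begin(protect_stack *stack) {
    protect_scope scope;
    scope.stack = stack;
    scope.n = 0;
    return scope;
}

/* Push obj on the stack and bump the scope's counter. */
void *scope_protect(protect_scope *scope, void *obj) {
    assert(scope->stack->top < STACK_MAX);
    scope->stack->slots[scope->stack->top++] = obj;
    scope->n++;
    return obj;
}

/* Equivalent of UNPROTECT(n): pop everything this scope pushed. */
void scope_end(protect_scope *scope) {
    scope->stack->top -= scope->n;
    scope->n = 0;
}
```

In this model the author never passes a count by hand. Entries pushed
before scope_begin() are left untouched by scope_end(), so the caller's
own protections stay balanced.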

The most general feature would be a hook for adding custom C functions to
the .Call exit, similar to on.exit() in R:

  xmlNodePtr node = xmlNewNode(...);
  Rf_on_exit(xmlFreeNode, node);
  EVP_PKEY_CTX *ctx = EVP_PKEY_CTX_new(...);
  Rf_on_exit(EVP_PKEY_CTX_free, ctx);
  SEXP out = PROTECT(allocVector(...));
  Rf_on_exit(UNPROTECT, 1);

I don't know R's internals well enough to estimate if something like
this would be possible. I did put together a simple C example of a
linked list with object pointers and their corresponding free
functions, which can easily be free'd with a single call:
http://git.io/vBqRA . So basically what is mostly missing at this
point is a way to trigger this at the end of the .Call in a way that
works for regular returns, errors and interruptions...
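
For illustration, a self-contained sketch of such a list follows. It is
not the code behind the link above and not part of R: call_frame,
frame_on_exit(), frame_error() and frame_call() are hypothetical names,
and setjmp/longjmp only simulates the non-local exit that an error or
interrupt would cause. How R itself would trigger the cleanup is
exactly the open question.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

typedef void (*finalizer_fn)(void *);

typedef struct cleanup_node {
    finalizer_fn fn;             /* e.g. a wrapper around xx_free() */
    void *ptr;                   /* object handed to fn */
    struct cleanup_node *next;
} cleanup_node;

typedef struct {
    cleanup_node *head;          /* most recently registered first */
    jmp_buf on_error;            /* stands in for R's error unwinding */
} call_frame;

/* Register fn(ptr) to run when the frame exits. Returns 0 on success. */
int frame_on_exit(call_frame *frame, finalizer_fn fn, void *ptr) {
    assert(fn != NULL);
    cleanup_node *node = malloc(sizeof *node);
    if (node == NULL) return -1;
    node->fn = fn;
    node->ptr = ptr;
    node->next = frame->head;
    frame->head = node;
    return 0;
}

/* Run all finalizers, newest first, and empty the list. */
void frame_run_finalizers(call_frame *frame) {
    while (frame->head != NULL) {
        cleanup_node *node = frame->head;
        frame->head = node->next;
        node->fn(node->ptr);
        free(node);
    }
}

/* Simulated error: jump out of the body without returning. */
void frame_error(call_frame *frame) {
    longjmp(frame->on_error, 1);
}

/* Run body(frame). Finalizers fire on a normal return and after
   frame_error(). Returns 0 for a normal return, 1 after an error. */
int frame_call(call_frame *frame, void (*body)(call_frame *)) {
    int failed;
    frame->head = NULL;
    if (setjmp(frame->on_error) == 0) {
        body(frame);
        failed = 0;
    } else {
        failed = 1;
    }
    frame_run_finalizers(frame);
    return failed;
}

/* Demo only: count how many buffers the finalizers released. */
int freed = 0;
void counting_free(void *ptr) { free(ptr); freed++; }

/* Demo helper: allocate a buffer that the frame frees on exit. */
void *frame_malloc(call_frame *frame, size_t size) {
    void *ptr = malloc(size);
    if (ptr != NULL && frame_on_exit(frame, counting_free, ptr) != 0) {
        free(ptr);               /* could not register, so do not leak */
        ptr = NULL;
    }
    return ptr;
}

void body_ok(call_frame *frame) {
    frame_malloc(frame, 16);
}

void body_fails(call_frame *frame) {
    frame_malloc(frame, 16);
    frame_malloc(frame, 32);
    frame_error(frame);          /* skips any code after this line */
}
```

Given a call_frame frame, frame_call(&frame, body_fails) releases both
buffers even though body_fails never reaches its closing brace. The
frame is supplied by the caller rather than declared inside
frame_call(), which avoids relying on local variables that change
between setjmp and longjmp.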
