[R] Architecting an optimization with external calls

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Nov 4 22:12:56 CET 2003

Look into external pointers.  That is how we have tackled this, e.g. in 
the ts package.
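
A minimal sketch of the idea, under stated assumptions: the C++ side owns
the persistent data behind one opaque handle, and every call from R passes
that handle back in.  The names LikState, lik_create, lik_eval, lik_n_evals
and lik_destroy are hypothetical, and the normal-mean likelihood is only a
stand-in.  The R glue is not shown: in a real package the pointer returned
by lik_create would be wrapped with R_MakeExternalPtr, and lik_destroy would
be called from a finalizer registered with R_RegisterCFinalizerEx.  This is
not the code used in the ts package.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical persistent state for the likelihood (illustrative only).
// It is allocated with new, so it lives outside R's heap and R's garbage
// collector neither moves nor frees it.
struct LikState {
    std::vector<double> data;  // observations kept across calls
    std::size_t n_evals = 0;   // example of state that changes between calls
};

// Build the state once and return it as an opaque handle.
extern "C" void* lik_create(const double* x, std::size_t n) {
    LikState* s = new LikState;
    s->data.assign(x, x + n);
    return s;
}

// Evaluate a stand-in objective: the negative log-likelihood of N(mu, 1),
// up to an additive constant, i.e. 0.5 * sum((x_i - mu)^2).
extern "C" double lik_eval(void* handle, double mu) {
    assert(handle != nullptr);
    LikState* s = static_cast<LikState*>(handle);
    ++s->n_evals;
    double nll = 0.0;
    for (double xi : s->data) {
        const double r = xi - mu;
        nll += 0.5 * r * r;
    }
    return nll;
}

// Report how many times the objective has been evaluated on this handle.
extern "C" std::size_t lik_n_evals(void* handle) {
    assert(handle != nullptr);
    return static_cast<LikState*>(handle)->n_evals;
}

// Release the state.  This is the function a finalizer would call.
extern "C" void lik_destroy(void* handle) {
    delete static_cast<LikState*>(handle);
}
```

With this layout only the external pointer object is an R object, so nothing
needs to stay PROTECTed between separate calls from R, and the R optimizer
can call the objective repeatedly on the same handle.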

On Tue, 4 Nov 2003, Ross Boylan wrote:

> I have a likelihood I would like to compute using C++ and then
> optimize.  It has data that need to persist across individual calls to
> the likelihood.  I'd appreciate any recommendations about the best way
> to do this.  There are several, related issues.
>
> 1. Use the R optimizer or a C optimizer?
> Because of the persistence problems (see below), using a C optimizer has
> a certain attraction.  However, the C methods described in 5.8 of the
> "Writing R Extensions" include the caveat that "No function is provided
> for finite-differencing, nor for approximating the Hessian at the
> result."  That's a big drawback, since I need that information. 
> (Probably I will be doing this without analytic derivatives.)
>
> 2. How to persist the data?
> I think my preferred approach would be to pass data back to R (assuming
> the "optimize with R" approach above), and then pass it on to subsequent
> calls.  The data would be the top of an object graph (i.e., there are
> pointers to disconnected chunks of memory) and it is not clear to me how
> to do this.  First, the documentation doesn't indicate any "opaque" data
> type; should I use character (STRSXP)?  Second, I'm not sure how to
> protect it and the other chunks of memory.  Does each one need to go
> inside a PROTECT call?  And is it safe to have one invocation from R do
> PROTECT, and another much later one do UNPROTECT (all the examples I saw
> had both calls within the same function invocation)?
> My hope is that if I allocate an object outside of R and don't tell R
> about it, R will never touch it.  So I only need PROTECT for something
> going back to R.  True?
> Also, the docs say not to protect too many items; there may be a lot. 
> So I'd probably end up having to write my own alloc out of pools that
> were protected, and that's just another layer of junk in terms of the
> original problem.
> Another approach would be to just hang the data somewhere in the global
> space of the shared library.  On general principles this is a poor
> approach ("don't use globals"), manifest in specific failings such as
> lack of thread safety.  I also suspect the issues with getting that to
> work portably are probably considerable (as in, it may not be possible).
> P.S. The example of Zero-finding (4.9.1 in "Writing R Extensions") is,
> unfortunately, the reverse of this case.  In the example, the function
> to be optimized is in R, while the optimizer is in C.
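
On point 1: if the optimization is driven from R, optim(..., hessian = TRUE)
returns a numerically differentiated Hessian.  If a C optimizer were used
instead, the Hessian would have to be approximated by hand.  A rough sketch
of a central-difference approximation follows; the names Objective and
fd_hessian and the default step h = 1e-4 are assumptions made for
illustration, not part of the R API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical objective type: maps a parameter vector to a scalar.
using Objective = std::function<double(const std::vector<double>&)>;

// Central-difference approximation to the Hessian of f at x:
//   H[i][j] ~ ( f(x + h e_i + h e_j) - f(x + h e_i - h e_j)
//             - f(x - h e_i + h e_j) + f(x - h e_i - h e_j) ) / (4 h^2)
// For i == j this reduces to a second difference with step 2h.
std::vector<std::vector<double>> fd_hessian(const Objective& f,
                                            std::vector<double> x,
                                            double h = 1e-4) {
    assert(h > 0.0);
    const std::size_t n = x.size();
    std::vector<std::vector<double>> H(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i; j < n; ++j) {
            // Evaluate f at x shifted by si*h along i and sj*h along j.
            auto shifted = [&](double si, double sj) {
                std::vector<double> y = x;
                y[i] += si * h;
                y[j] += sj * h;
                return f(y);
            };
            const double d = (shifted(+1, +1) - shifted(+1, -1)
                            - shifted(-1, +1) + shifted(-1, -1))
                           / (4.0 * h * h);
            H[i][j] = d;
            H[j][i] = d;  // the Hessian is symmetric
        }
    }
    return H;
}
```

A fixed step is a simplification; in practice the step would be scaled to
the size of each parameter.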

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
