[Rd] Memory allocation in C/C++ vs R?
Romain Francois
romain.francois at dbmail.com
Sat May 1 11:02:35 CEST 2010
Simon,
On 30/04/10 20:12, Simon Urbanek wrote:
>
> Dominick,
>
> On Apr 30, 2010, at 1:40 PM, Dominick Samperi wrote:
>
>> Just to be sure that I understand, are you suggesting that the R-safe way to do things is to not use STL, and to not use C++ memory management and exception handling? How can you leave a function in an irregular way without triggering a seg fault or something like that, in which case there is no chance for recovery anyway?
>>
>> In my experience the C++ exception stack seems to unwind properly before returning to R when there is an exception, and memory that is allocated by C++ functions seems to maintain its integrity and does not interfere with R's memory management.
>>
>> It would be helpful if you could specify what kind of interference you are referring to here between C++ exception handling and R's error handling, and why STL is dangerous and best avoided in R. I have used STL with R for a long time and have experienced no problems.
>>
>
> There are essentially two issues here that I had in mind.
>
> 1) C++ exception handling and R's error handling both use setjmp/longjmp, each on the assumption that no one else does the same. That assumption is violated when both are used, so interleaving them will cause problems (you're fine if you can guarantee that they always nest properly, but that's not always easy to achieve and violations are easy to miss).
>
> 2) C++ compilers assume that you cannot leave the context of a function in unusual ways. But you can, namely if an R error is raised. This affects (among others) locally allocated objects.
Thank you for these nuggets of information. It might be beneficial to
promote them to some notes in the "Interfacing C++ code" section of WRE.
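
To make the second point concrete, here is a minimal sketch of one way to avoid jumping out of a scope that still holds C++ objects: do the C++ work in an inner scope, copy any error message into a plain static buffer, and only signal the error once every destructor has run. It is plain C++ with no R headers; Resource, do_work, entry_point and error_buffer are hypothetical names, and the place where a real package would call error() is only marked by a comment.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Counts live Resource objects so the sketch can check that none leak.
static int live_objects = 0;

// Hypothetical C++ object with a destructor that must run.
struct Resource {
    Resource() { ++live_objects; }
    ~Resource() { --live_objects; }
};

// Plain static storage: it has no destructor, so it is safe to use
// after the C++ scope has been closed.
static char error_buffer[256];

// Hypothetical worker: returns an empty string on success, otherwise
// the message to report. It always returns normally.
std::string do_work(int n) {
    Resource r;
    std::vector<int> buf(16);
    if (n < 0) return "n must be non-negative";
    buf[0] = n;
    return "";
}

// Hypothetical entry point: returns nullptr on success, or the message
// that would be handed to the host's error function.
const char* entry_point(int n) {
    bool failed = false;
    {
        // All C++ objects live inside this block.
        std::string msg = do_work(n);
        if (!msg.empty()) {
            std::snprintf(error_buffer, sizeof error_buffer, "%s", msg.c_str());
            failed = true;
        }
    }   // msg (and everything do_work created) is destroyed here

    assert(live_objects == 0);
    if (failed) {
        // A real package would call the non-returning error function here,
        // e.g. error("%s", error_buffer); this sketch just returns the message.
        return error_buffer;
    }
    return nullptr;
}
```

The design choice is that only trivially destructible data (a bool and a static char array) is alive at the point where the jump would happen, so nothing is left for the jump to skip.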
As a side note for users of Rcpp: in recent versions of Rcpp, we take
precautions against the R error/C++ exception problem. We encourage users
to always enclose each C++ function in an explicit try/catch that forwards
the exception to R, and in Rcpp 0.8.0 (about to be released) we have the
macros BEGIN_RCPP / END_RCPP.
So foo below would become:

    extern "C" SEXP foo() {
    BEGIN_RCPP
        vector<int> a;
        a.resize(-1);
    END_RCPP
        return R_NilValue;
    }

We also have less invasive macros RCPP_FUNCTION_0, ...,
RCPP_FUNCTION_65, and RCPP_FUNCTION_VOID_0, ..., so that one can write
foo as:

    RCPP_FUNCTION_VOID_0(foo) {
        vector<int> a;
        a.resize(-1);
    }
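
The try/catch/forward idea behind these macros can be sketched without Rcpp. This is not the actual expansion of BEGIN_RCPP / END_RCPP; BoundaryResult, guarded_call and foo_guarded are hypothetical names, and the returned struct stands in for forwarding the message to R once the catch block has finished.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <string>
#include <vector>

// Hypothetical result type: stands in for "forward the message to R".
struct BoundaryResult {
    bool ok;
    std::string message;
};

// Runs body() and converts any exception it throws into a BoundaryResult,
// so that the exception does not propagate past the boundary function.
template <typename F>
BoundaryResult guarded_call(F body) {
    try {
        body();
        return BoundaryResult{true, ""};
    } catch (const std::exception& e) {
        return BoundaryResult{false, e.what()};
    } catch (...) {
        return BoundaryResult{false, "unknown C++ exception"};
    }
}

// The foo example from above: resize(-1) asks for more elements than
// max_size(), so the vector throws std::length_error.
BoundaryResult foo_guarded() {
    return guarded_call([] {
        std::vector<int> a;
        a.resize(static_cast<std::size_t>(-1));
    });
}

// Usage: the exception is reported as data instead of escaping.
void guarded_call_demo() {
    BoundaryResult r = foo_guarded();
    assert(!r.ok);
    assert(!r.message.empty());
}
```

In a real package the boundary function would hand r.message to R's error mechanism only after the try/catch has completed, so that the C++ exception machinery is no longer active when R takes over.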
We also take care of the second problem: callbacks to R through
the Rcpp API are enclosed in R tryCatch blocks, through our
Evaluator::run:

    SEXP Evaluator::run(SEXP expr, SEXP env) throw(eval_error) {
        SEXP call = PROTECT( Rf_lang3( Rf_install("rcpp_tryCatch"), expr, env ) );
        Environment RCPP = Environment::Rcpp_namespace();

        /* call the tryCatch call */
        SEXP res = PROTECT( Rf_eval( call, RCPP ) );

        /* was there an error ? */
        int error = LOGICAL( Rf_eval( Rf_lang1( Rf_install("errorOccured") ), RCPP ) )[0];
        if( error ){
            SEXP err_msg = PROTECT( Rf_eval(
                Rf_lang1( Rf_install("getCurrentErrorMessage") ), RCPP ) );
            std::string message = CHAR(STRING_ELT(err_msg, 0));
            UNPROTECT( 3 );
            throw eval_error(message);
        } else {
            UNPROTECT( 2 );
            return res;
        }
    }

So when users of Rcpp do callbacks to R, like this:

    Rcpp::Function rnorm( "rnorm" );
    rnorm( -10 );

Rcpp catches the R error and converts it into a C++ eval_error exception,
so that it can be handled with C++ try/catch.
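
Here is a self-contained sketch of that conversion and of what it buys on the C++ side. It does not use Rcpp: host_sqrt is a hypothetical stand-in for an R call that has already been wrapped in tryCatch and therefore reports failure through a status flag, and the eval_error defined here is a local stand-in that only borrows the name. Because the failure travels as a C++ exception, the Tracker destructor runs during unwinding.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>
#include <string>

// Local stand-in for the exception type; only the name is borrowed.
struct eval_error : std::runtime_error {
    explicit eval_error(const std::string& msg) : std::runtime_error(msg) {}
};

// Hypothetical status returned by a host call wrapped in tryCatch.
struct HostStatus {
    bool error;
    std::string message;
    double value;
};

// Hypothetical host function: fails by returning a status, never by jumping.
HostStatus host_sqrt(double x) {
    if (x < 0) return HostStatus{true, "invalid argument", 0.0};
    return HostStatus{false, "", std::sqrt(x)};
}

// Converts a reported host error into a C++ exception.
double checked_sqrt(double x) {
    HostStatus s = host_sqrt(x);
    if (s.error) throw eval_error(s.message);
    return s.value;
}

// Increments a counter when destroyed, to show that unwinding ran.
struct Tracker {
    int& destroyed;
    explicit Tracker(int& d) : destroyed(d) {}
    ~Tracker() { ++destroyed; }
};

// Calls back into the "host" and handles failure with try/catch.
std::string try_callback(double x, int& destroyed) {
    try {
        Tracker t(destroyed);
        checked_sqrt(x);
        return "ok";
    } catch (const eval_error& e) {
        return std::string("caught: ") + e.what();
    }
}

// Usage: the error is caught and the local object is still destroyed.
void try_callback_demo() {
    int destroyed = 0;
    assert(try_callback(-1.0, destroyed) == "caught: invalid argument");
    assert(destroyed == 1);
}
```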
The tryCatch wrapper in Evaluator::run is somewhat of a hack, due to the
lack of an explicit C-level API for R error handling:
http://www.mail-archive.com/r-devel@r-project.org/msg19300.html
Romain
> On 1:
>
> You cannot interleave R error handling and C++ exceptions. For example, if there is a chance of a C++ exception, you must guarantee that the exception won't leave the R context that you are in. This is easily demonstrated because R checks the consistency (see ex.1). Vice versa, the consequences are not easily visible, because C++ provides no tracking, but they are equally fatal. If you raise an R error from C++, it does not clean up whatever C++ exception context you were in and bypasses it. But there are even more grave consequences:
>
> On 2:
>
> If you raise any R error from within C++ code you'll break the assumption of C++ that it has control over the entry/exit point of a function. Take a really trivial example:
>
> void foo() {
>   Object o;
>   // some other code ....
>   error("blah");
> }
>
> Normally, the lifetime of o is controlled by C++ and its destructor is correctly executed when you leave the function. However, the error call in R will bypass that: the object won't be destroyed even though it was allocated on the stack. Although it's obvious in the example above, pretty much all R API functions can raise errors, so the same applies to any R API call, direct or indirect. As a consequence you pretty much cannot call R API functions from C++ unless you are very, very careful (don't forget that C++ does a lot of things behind your back, such as initializing objects, exception contexts etc., which you technically have no control over).
>
>
> As I said in my post, you can write safe C++ code, but you have to be very careful. But the point about libraries is that you have no control over what they do, so you cannot know whether they will interact in a bad way with R or not. STL is an example where only the interface is defined, the implementations are not and vary by OS, compiler etc. This makes it pretty much impossible to use it reliably since the fact that it will work on one implementation doesn't mean that it will work on another since it is the implementation details that will bite you. (I know that we had reports of things breaking due to STL but I don't remember what implementation/OS it was)
>
> [The above issues are only the ones I was pointing out; there may be others that are not covered here].
>
> Cheers,
> Simon
>
>
>
>
> ---- R context vs C++ exception example
>
>
>> dyn.load("stl.so")
>> .Call("bar")
> something went wrong somewhere in C++...
> Warning: stack imbalance in '.Call', 2 then 4
> NULL
>
> -- what happens is that this really corrupts the R call stack, since the C++ exception mechanism bypassed R's call stack, so R is now in an inconsistent state. The same can be provoked vice versa (and is more common: using error in C++ will do it), but that's harder to show because you would have to track C++ allocations to see that you're leaking objects all over the place. That is also the reason why it's hard to find until it's too late (and things may *appear* to work for some time while they are not).
>
>
> ----stl.cc:
>
> #include<Rinternals.h>
> #include<vector>
>
> using namespace std;
>
> extern "C" SEXP foo() {
> vector<int> a;
> a.resize(-1);
> return R_NilValue;
> }
>
> extern "C" SEXP bar() {
> try {
> // lots of other C++ code here ...
> eval(lang2(install(".Call"),mkString("foo")), R_GlobalEnv);
> } catch (...) {
> REprintf("something went wrong somewhere in C++...\n");
> }
> return R_NilValue;
> }
>
>> The fact that R has a C main may be problematic because C++ static
>> initializers may not be called properly, but the fact that packages are
>> usually loaded dynamically complicates this picture. The dynamic
>> library itself may take care of calling the static initializers (I'm not
>> sure about this, and this is probably OS-dependent). One possible
>> work-around would be to compile the first few lines (a stub) of
>> R main using the C++ compiler, leaving everything else as is
>> and compiled using the C compiler (at least until CXXR is widely
>> available).
>>
>> Since C++ (and STL) are very popular it would be helpful for developers
>> to have a better idea of the benefits and risks of using these tools
>> with R.
>>
>> Thanks,
>> Dominick
>>
>> On Fri, Apr 30, 2010 at 9:00 AM, Simon Urbanek
>> <simon.urbanek at r-project.org> wrote:
>>> Brian's answer was pretty exhaustive - just one more note that is indirectly related to memory management: C++ exception handling does interfere with R's error handling (and vice versa), so in general STL is very dangerous and best avoided in R. In addition, remember that the regular rules for local objects are broken because you are not guaranteed to leave a function the regular way, so there is a high danger of leaks and inconsistencies when using C++ memory management unless you specifically account for that. That said, I have written C++ code that works in R, but you have to be very, very careful and think twice about using any complex C++ libraries, since they are unlikely to be written in an R-safe way.
>>>
>>> Cheers,
>>> Simon
>>>
>>>
>>> On Apr 30, 2010, at 1:03 AM, Dominick Samperi wrote:
>>>
>>>> The R docs say that there are two methods by which the C programmer can
>>>> allocate memory, one where R automatically frees the memory on
>>>> return from .C/.Call, and the other where the user takes responsibility
>>>> for freeing the storage. Both methods involve using R-provided
>>>> functions.
>>>>
>>>> What happens when the user uses the standard "new" allocator?
>>>> What about when a C++ application uses STL and that library
>>>> allocates memory? In both of these cases the R-provided functions
>>>> are not used (to my knowledge), yet I have not seen any problems.
>>>>
>>>> How is the memory that R manages and garbage collects kept
>>>> separate from the memory that is allocated on the C++ side
>>>> quite independently of what R is doing?
>>>>
>>>> Thanks,
>>>> Dominick
--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9aKDM9 : embed images in Rd documents
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7