[R-SIG-Win] experimental toolchain: C++ exceptions + setjmp, longjmp
Kevin Ushey
kevinushey at gmail.com
Thu Sep 10 04:44:46 CEST 2015
One candidate here -- it looks like the way msvcrt's setjmp is called
has changed in the new toolchain. Previously, we had (in setjmp.h):
#ifdef _WIN64
#define setjmp(BUF) _setjmp((BUF), mingw_getsp())
#else
#define setjmp(BUF) _setjmp3((BUF), NULL)
#endif
whereas now we have:
# ifdef _WIN64
# if (__MINGW_GCC_VERSION < 40702)
# define setjmp(BUF) _setjmp((BUF), mingw_getsp())
# else
# define setjmp(BUF) _setjmp((BUF), __builtin_frame_address (0))
# endif
# else
# define setjmp(BUF) _setjmp3((BUF), NULL)
# endif
In other words, the previous toolchain attempted to get the stack
pointer with `mingw_getsp()`, while now it's done through
`__builtin_frame_address(0)`.
The definition of `mingw_getsp()` is here:
https://sourceforge.net/p/mingw-w64/mingw-w64/ci/9592360c86d5cc04d7a265fb11e4611f6df33a62/tree/mingw-w64-crt/misc/mingw_getsp.S
The definition for `__builtin_frame_address()` is here (assuming I am
reading the gcc sources properly):
https://github.com/gcc-mirror/gcc/blob/7aea4e7cdcd40d7bd47c64e76325a62191887d1b/gcc/builtins.c#L4528-L4574
And motivation for the change is here:
https://sourceforge.net/p/mingw-w64/mailman/mingw-w64-public/thread/CAEwic4bjMb851Y2T2dUr_ObBVYaiWY0XkgWnDodVUgjibWO-Sw%40mail.gmail.com/#msg29496874
FWIW, in my tests `mingw_getsp()` and `__builtin_frame_address()` do
return different values, so I think this is worth investigating a bit,
and hopefully someone with more expertise than I can weigh in.
Kevin
On Wed, Sep 9, 2015 at 1:25 PM, Kevin Ushey <kevinushey at gmail.com> wrote:
> Hi all,
>
> Using the version of R + toolchain built by Jeroen here:
>
> http://www.stat.ucla.edu/~jeroen/mingw-w64/
>
> I'm running into segfaults when attempting to run Rcpp unit tests with
> 64bit R. In particular, when an Rcpp module throws an exception, it is
> later caught, and 'forwarded to R' -- effectively, we have a construct
> of the form:
>
> SEXP r_condition = R_NilValue;
> try {
>     // C++ code that might throw
> } catch (...) {
>     // populate r_condition
> }
>
> // forward errors back to R
> if (r_condition != R_NilValue) {
>     SEXP call = Rf_lang2(Rf_install("stop"), r_condition);
>     Rf_eval(call, R_GlobalEnv); // will longjmp
> }
>
> The main goal is to ensure that any longjmps performed occur at the
> top level, with no C++ objects on the stack (and so no destructors are
> skipped). This allows us to safely transfer control back to R from a
> C++ context.
>
> The problem is, with the new toolchain, I'm seeing a crash on the
> attempted longjmp performed by R in the Rf_eval call.
>
> Moreover, I get an error even with a _local_ setjmp / longjmp
> after the exception is thrown and caught; i.e., this fails:
>
> if (r_condition != R_NilValue) {
>     jmp_buf buffer;
>     if (!setjmp(buffer)) {
>         longjmp(buffer, 1);
>     }
> }
>
> This silly local setjmp / longjmp causes a crash, which I think
> implies 'something' has been corrupted 'somewhere' in the previous
> logic (unfortunately, I have no idea where).
>
> There is some extra discussion in the pull request that attempted to
> work around this (https://github.com/RcppCore/Rcpp/pull/371), but I
> think ultimately this needs to be narrowed down into a reproducible
> example. So far, the pieces seem to be:
>
> 1) A C++ function throws an exception,
> 2) That exception is caught,
> 3) We attempt to perform a longjmp after the exception is caught / handled.
>
> I still don't have any idea where the fault lies (most likely within the
> toolchain, but could be within Rcpp or R itself) but if anyone is
> feeling 'brave' some help in generating a reproducible example would
> be very helpful. Note that such examples execute fine on other
> platforms, and with the old Windows toolchain, so in all likelihood
> it's a bug in exception handling with the new toolchain.
>
> It's also possible that the SJLJ exceptions used by the toolchain do
> not play well with the SEH exception model used internally by Windows
> DLLs (and, in fact, this _is_ where the crashes do later happen), but
> I think we want to stick with SJLJ for compatibility with pre-existing
> DLLs. This also seems surprising though, since the previous Windows
> toolchain used SJLJ exceptions (IIUC).
>
> NOTE: The previously established workaround, using
> `-fno-asynchronous-unwind-tables`, does not seem to save us from this
> bug.
>
> Kevin
>
> ---
>
> Rcpp's failing unit test code:
> https://github.com/RcppCore/Rcpp/blob/7d556daf295e279edea7990e7b7c19f58e1f9d8f/inst/unitTests/runit.Module.R#L69
>
> The C++ code actually executed:
> https://github.com/RcppCore/Rcpp/blob/7d556daf295e279edea7990e7b7c19f58e1f9d8f/inst/include/Rcpp/module/Module_Property.h#L36
>
> Where it's caught:
> https://github.com/RcppCore/Rcpp/blob/9491c727c1f53006f17cb4c932aecbaa4617757f/inst/include/Rcpp/macros/macros.h#L38-L56
>
> The callstack from WinDbg (NOTE: Some symbol names from R may be wrong)
>
> Child-SP RetAddr Call Site
> 00000000`044021a0 00000000`76ecf4ec ntdll!RtlRaiseStatus+0x18
> 00000000`04402740 000007fe`fdd6e543 ntdll!RtlIsDosDeviceName_U+0x15acc
> 00000000`04402de0 00000000`6c7c1d0e msvcrt!longjmp+0x63
> 00000000`04403320 00000000`6c7c0d13 R!Rf_error+0x2ee
> 00000000`04403580 00000000`6c7c4314 R!R_GetSrcFilename+0x513
> 00000000`04407800 00000000`6c7c4388 R!Rf_warningcall_immediate+0x1044
> 00000000`04407840 00000000`6c7d3f4e R!Rf_warningcall_immediate+0x10b8
> 00000000`04407870 00000000`6c7e4b31 R!do_Rprof+0x7c6e
> 00000000`044083f0 00000000`6c7e64e6 R!Rf_eval+0x161
> 00000000`04408660 00000000`6c7e4ca3 R!Rf_applyClosure+0x5d6
> 00000000`044088c0 00000000`6c54434f R!Rf_eval+0x2d3
> 00000000`04408b30 00000000`003fff90 sourceCpp_1!ZN4Rcpp6class_I12ModuleNumberE11setPropertyEP7SEXPRECS4_S4_+0x4ef
> 00000000`04408b38 00000000`6c80529d 0x3fff90
> 00000000`04408b40 00000004`00000004 R!Rf_pmatch+0xa6d
> 00000000`04408c40 00000000`00000000 0x00000004`00000004