[R-pkg-devel] RFC: C backtraces for R CMD check via just-in-time debugging

Vladimir Dergachev volodya using mindspring.com
Thu Mar 7 15:38:18 CET 2024


Hi Ivan,

Here is the piece of code I currently use:

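#define UNW_LOCAL_ONLY          /* only unwind the current process */
#include <libunwind.h>
#include <stdio.h>

void backtrace_dump(void)
{
     unw_cursor_t    cursor;
     unw_context_t   context;

     unw_getcontext(&context);
     unw_init_local(&cursor, &context);

     /* Walk up the stack, one frame per iteration */
     while (unw_step(&cursor) > 0)
     {
         unw_word_t  offset = 0, pc = 0;
         char        fname[64];

         unw_get_reg(&cursor, UNW_REG_IP, &pc);

         /* Leave the name empty if no symbol can be found */
         fname[0] = '\0';
         (void) unw_get_proc_name(&cursor, fname, sizeof(fname), &offset);

         /* Print pc relative to a known symbol to cancel out address randomization */
         fprintf(stderr, "0x%016lx : (%s+0x%lx)\n",
                 (unsigned long)(pc - (unw_word_t)backtrace_dump),
                 fname, (unsigned long)offset);
     }
}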

To make it safe to call from a signal handler, one can simply replace 
fprintf() (which is not async-signal-safe) with a function that stores 
the information into a buffer.
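
A minimal sketch of such a replacement, in case it is useful. All the 
names here (bt_buf, bt_put_hex, bt_put_frame, bt_flush) are made up for 
illustration and are not part of libunwind. It formats one frame into a 
static buffer using only plain loops, and write() is the only system call. 
Unlike the fprintf() version, the addresses are not zero-padded.

```c
#include <stddef.h>
#include <unistd.h>

#define BT_BUF_SIZE 4096

/* Hypothetical static buffer: no malloc(), no stdio */
char   bt_buf[BT_BUF_SIZE];
size_t bt_len;

/* Append a string, silently truncating when the buffer is full */
void bt_put_str(const char *s)
{
    while (*s != '\0' && bt_len < BT_BUF_SIZE - 1)
        bt_buf[bt_len++] = *s++;
    bt_buf[bt_len] = '\0';
}

/* Append a value as 0x-prefixed hexadecimal without calling printf() */
void bt_put_hex(unsigned long v)
{
    char   digits[2 * sizeof v];
    size_t n = 0;

    do {
        digits[n++] = "0123456789abcdef"[v & 0xf];
        v >>= 4;
    } while (v != 0);

    bt_put_str("0x");
    while (n > 0 && bt_len < BT_BUF_SIZE - 1)
        bt_buf[bt_len++] = digits[--n];
    bt_buf[bt_len] = '\0';
}

/* Append one line in the same layout as the fprintf() call */
void bt_put_frame(unsigned long pc, const char *fname, unsigned long offset)
{
    bt_put_hex(pc);
    bt_put_str(" : (");
    bt_put_str(fname);
    bt_put_str("+");
    bt_put_hex(offset);
    bt_put_str(")\n");
}

/* Write the buffer out with write(), then reset it */
void bt_flush(int fd)
{
    size_t done = 0;

    while (done < bt_len) {
        ssize_t r = write(fd, bt_buf + done, bt_len - done);
        if (r <= 0)
            break;
        done += (size_t) r;
    }
    bt_len = 0;
    bt_buf[0] = '\0';
}
```

Inside the unwinding loop one would call bt_put_frame() instead of 
fprintf(), and call bt_flush(2) once after the loop.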

Several things to point out:

   * printing pc-(long)backtrace_dump works around address randomization, 
so that if you attach the debugger you can find the location again by 
using backtrace_dump+0xxxx (it does not have to be backtrace_dump, any 
symbol will do)

   * this works even if the symbols are stripped, in which case it finds an 
offset relative to the nearest available symbol - there are always some 
from the loader. Of course, in this case you should use the offsets and 
the debugger to find out what's wrong

   * you can call backtrace_dump() from anywhere, it does not have to be a 
signal handler. I've taken to calling it when my programs detect some 
abnormal situation, so I can see the call chain.

   * this should work as a package, but I am not sure whether the offsets 
between package symbols and R symbols would be static or not. For R it 
might be a good idea to also print a table of offsets between some R 
symbol and the init function of each loaded C package (such as 
R_init_RMVL()), at least initially.

   * R ought to know where packages are loaded, we might want to be clever 
and print out information on which package contains which function, or 
there might be identical R_init_RMVL() printouts.
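
For the last two points, here is a rough sketch of how one could find out 
which loaded shared object contains a given address, and what its load 
base is. It assumes Linux with glibc and uses dl_iterate_phdr() from 
<link.h>. The names struct lookup, find_object and locate_address are made 
up for illustration. I would not call this from a signal handler, it is 
meant to be run ahead of time or from normal code.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the address to look up, and the answer */
struct lookup {
    uintptr_t   addr;    /* address to classify */
    const char *object;  /* path of the containing object ("" for the main program) */
    uintptr_t   base;    /* load base of that object */
};

/* Callback: called once per loaded object, returns 1 to stop the iteration */
static int find_object(struct dl_phdr_info *info, size_t size, void *data)
{
    struct lookup *q = data;
    (void) size;

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        uintptr_t start;

        if (ph->p_type != PT_LOAD)
            continue;   /* only loadable segments occupy memory */

        start = info->dlpi_addr + ph->p_vaddr;
        if (q->addr >= start && q->addr - start < ph->p_memsz) {
            q->object = info->dlpi_name;
            q->base   = info->dlpi_addr;
            return 1;
        }
    }
    return 0;
}

/* Returns 1 and fills in *q if some loaded object contains addr, else 0 */
int locate_address(const void *addr, struct lookup *q)
{
    q->addr   = (uintptr_t) addr;
    q->object = NULL;
    q->base   = 0;
    return dl_iterate_phdr(find_object, q);
}
```

Printing q.object together with q.addr - q.base for each frame would give 
an offset that stays the same between runs and also says which package's 
shared object the frame belongs to.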

best

Vladimir Dergachev

On Thu, 7 Mar 2024, Ivan Krylov wrote:

> On Tue, 5 Mar 2024 18:26:28 -0500 (EST)
> Vladimir Dergachev <volodya using mindspring.com> wrote:
>
>> I use libunwind in my programs, works quite well, and simple to use.
>>
>> Happy to share the code if there is interest..
>
> Do you mean that you use libunwind in signal handlers? An example on
> how to produce a backtrace without calling any async-signal-unsafe
> functions would indeed be greatly useful.
>
> Speaking of shared objects injected using LD_PRELOAD, I've experimented
> some more, and I think that none of them would work with R without
> additional adjustments. They install their signal handler very soon
> after the process starts up, and later, when R initialises, it
> installs its own signal handler, overwriting the previous one. For this
> scheme to work, either R would have to cooperate, remembering a pointer
> to the previous signal handler and calling it at some point (which
> sounds unsafe), or the injected shared object would have to override
> sigaction() and call R's signal handler from its own (which sounds
> extremely unsafe).
>
> Without that, if we want C-level backtraces, we either need to patch R
> to produce them (using backtrace() and limiting this to glibc systems
> or using libunwind and paying the dependency cost) or to use a debugger.
>
> -- 
> Best regards,
> Ivan
>


