[Rd] R 4.0.2 64-bit Windows hangs

Tomas Kalibera tomas.kalibera using gmail.com
Sat Aug 22 20:26:11 CEST 2020


On 8/22/20 7:58 PM, Jeroen Ooms wrote:
> On Sat, Aug 22, 2020 at 8:39 AM Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
>> On 8/21/20 11:45 PM, m19tdn+9alxwj7d2bmk--- via R-devel wrote:
>>> Ah yes, this is related. I reported v2010 below, but it looks like I was updated to this Insider Build overnight without my knowledge, and conflated it with the new installation of R v4 this morning.
>>>
>>> I will continue to look into the issue with the methods Tomas mentioned.
>> It is interesting that a rare 5-year-old problem would re-appear on
>> current Insider builds. Which build of Windows are you running exactly?
>> I've seen another report about a crash on 20190.1000. It'd be nice to
>> know if it is present also in newer builds, i.e. in 20197.
> I installed the latest 20197 build in a vm, and I can indeed reproduce
> this problem.
>
> What seems to be happening is that R triggers an infinite recursion in
> the Windows unwinding mechanism, and eventually dies with a stack
> overflow. Attached a backtrace of the initial 100 frames of the main
> thread (the pattern in the top ~30 frames continues forever).
>
> The Microsoft blog doesn't mention that anything related to exception
> handling has changed in recent versions:
> https://docs.microsoft.com/en-us/windows-insider/at-home/active-dev-branch

Thanks. Unfortunately that does not ring any bells (except for the case
below), and I can't guess from this what the underlying cause of the
problem is. There may be something wrong in how we use setjmp/longjmp or
in how setjmp/longjmp works on Windows.

It reminds me of a problem I was debugging a few days ago, in which the
longjmp implementation segfaults on Windows 10 (a recent but not Insider
build), probably soon after unwinding the stack, but only with GCC 10 /
MinGW 7, only in one of the no-segfault tests, and only with -O3 (not
with -O2, and not with -O3 -fno-split-loops). Interestingly, the problem
was sensitive to these optimization options at the call site of the
longjmp (do_abs), even when that function was not an immediate caller of
longjmp. I have not tracked this down yet; it will require looking at
the assembly level. I was suspecting a compiler error causing the
compiler to generate code that messes with the stack or registers in a
way that impacts the upcoming jump. But now that we have this other
problem with setjmp/longjmp, the compiler may not be the top suspect
anymore.
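
For readers less familiar with the mechanism, here is a minimal sketch of
the general setjmp/longjmp pattern being discussed: a context is saved at
a top-level frame, and an error deep in the call chain jumps back to it,
skipping the intermediate frames. This is not R's actual code. All names
(toplevel, raise_error, checked_reciprocal, run_protected) are
hypothetical and chosen only for illustration.

```c
#include <setjmp.h>

/* Hypothetical saved context, standing in for a top-level error target. */
static jmp_buf toplevel;

/* Error code recorded before the jump; volatile so its value is
   reliably visible after control returns through setjmp. */
static volatile int last_error;

/* Innermost frame: reports an error by jumping back to the setjmp site.
   This function never returns to its caller. */
static void raise_error(int code)
{
    last_error = code;
    longjmp(toplevel, 1);
}

/* Intermediate frame: its own caller is not an immediate caller of
   longjmp, yet its stack frame is discarded by the jump. */
static double checked_reciprocal(double x)
{
    if (x == 0.0)
        raise_error(2);   /* hypothetical error code */
    return 1.0 / x;
}

/* Top-level frame: returns 0 on success, or the recorded error code
   when control arrives here via longjmp. */
static int run_protected(double x, double *out)
{
    if (setjmp(toplevel) != 0)
        return last_error;
    *out = checked_reciprocal(x);
    return 0;
}
```

In this sketch, run_protected(4.0, &r) stores 0.25 in r and returns 0,
while run_protected(0.0, &r) returns 2 and leaves r untouched. On Windows
the jump may be implemented via the system's stack unwinding, which is
the part that appears to misbehave in the reports above.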

I may not be able to work on this in the next few days or a week, so if 
anyone gets there first, please let me know what you find out.

Thanks,
Tomas


