[Rd] R 4.0.2 64-bit Windows hangs

Tomas Kalibera tomas.kalibera at gmail.com
Wed Aug 26 19:54:00 CEST 2020


On 8/25/20 6:14 PM, Tomas Kalibera wrote:
> On 8/22/20 9:33 PM, Jeroen Ooms wrote:
>> On Sat, Aug 22, 2020 at 9:10 PM Tomas Kalibera 
>> <tomas.kalibera using gmail.com> wrote:
>>> On 8/22/20 8:26 PM, Tomas Kalibera wrote:
>>>> On 8/22/20 7:58 PM, Jeroen Ooms wrote:
>>>>> On Sat, Aug 22, 2020 at 8:39 AM Tomas Kalibera
>>>>> <tomas.kalibera using gmail.com> wrote:
>>>>>> On 8/21/20 11:45 PM, m19tdn+9alxwj7d2bmk--- via R-devel wrote:
>>>>>>> Ah yes, this is related. I reported v2010 below, but it looks like
>>>>>>> I was updated to this Insider Build overnight without my knowledge,
>>>>>>> and conflated it with the new installation R v4 this morning.
>>>>>>>
>>>>>>> I will continue to look into the issue with the methods Tomas
>>>>>>> mentioned.
>>>>>> It is interesting that a rare 5-year-old problem would re-appear on
>>>>>> current Insider builds. Which build of Windows are you running
>>>>>> exactly? I've seen another report about a crash on 20190.1000.
>>>>>> It'd be nice to know if it is also present in newer builds,
>>>>>> i.e. in 20197.
>>>>> I installed the latest 20197 build in a VM, and I can indeed
>>>>> reproduce this problem.
>>>>>
>>>>> What seems to be happening is that R triggers an infinite
>>>>> recursion in the Windows unwinding mechanism, and eventually dies
>>>>> with a stack overflow. Attached is a backtrace of the initial 100
>>>>> frames of the main thread (the pattern in the top ~30 frames
>>>>> continues forever).
>>>>>
>>>>> The Microsoft blog doesn't mention that anything related to
>>>>> exception handling has changed in recent versions:
>>>>> https://docs.microsoft.com/en-us/windows-insider/at-home/active-dev-branch
>>>>>
>>>> Thanks, unfortunately that does not ring any bells (except below), I
>>>> can't guess from this what is the underlying cause of the problem.
>>>> There may be something wrong in how we use setjmp/longjmp or how
>>>> setjmp/longjmp works on Windows.
>>>>
>>>> It reminds me of a problem I was debugging a few days ago, where the
>>>> longjmp implementation segfaults on Windows 10 (recent but not
>>>> Insider build), probably soon after unwinding the stack, but only with
>>>> GCC 10 / MinGW 7, only in one of the no-segfault tests, and only
>>>> with -O3 (not -O2, and not with -O3 -fno-split-loops). Interestingly,
>>>> the problem was sensitive to these optimization options at the call
>>>> site of longjmp (do_abs), even when it was not an immediate caller
>>>> of longjmp. I've not tracked this down yet; it will require
>>>> looking at the assembly level, and I was suspecting a compiler error
>>>> causing the compiler to generate code that messes with the stack or
>>>> registers in a way that impacts the upcoming jump. But now that we have
>>>> this other problem with setjmp/longjmp, the compiler may not be the top
>>>> suspect anymore.
>>>>
>>>> I may not be able to work on this in the next few days or a week, so
>>>> if anyone gets there first, please let me know what you find out.
>>> Btw, could you please check whether the UCRT build of R crashes as
>>> well in the Insider Windows build?
>> Yes, it hangs in exactly the same way, except that the backtrace shows
>>
>>   ucrtbase!.intrinsic_setjmpex () from C:\WINDOWS\System32\ucrtbase.dll
>>
>> Instead of msvcrt!_setjmpex (as expected of course).
>
> Thanks. I found what is causing the problem I observed with
> GCC 10/stock Windows 10; I expect it is the same one as in the
> Insider build.
> I will investigate further,
>
> Tomas
>
It seems the problem lies between MinGW-W64 and Windows, and it really
does cause both the reported crashes in an Insider build (I tested
20197) and the crash in a single "no-segfault" test in my GCC 10 builds.
setjmp is implemented using the Windows call _setjmpex, which takes a
second argument that MinGW-W64 sets differently depending on the GCC
version. When I set this argument the way MinGW-W64 did for early
versions of GCC, to mingw_getsp(), it fixes/hides the problems on my
systems. Perl5 uses a similar workaround, but otherwise there is no
solid basis (documentation, specification, etc.) that I am aware of for
this change, so it may take some more time to fix properly. Still, if
anyone experiments with this workaround and finds a problem, please let
me know. In particular, I am curious whether it works on earlier
versions of Windows (at least with check-all, including recommended
packages).
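
The workaround described above can be sketched roughly as follows. This
is only a minimal illustration under stated assumptions, not the
contents of the attached setjmp.diff: the macro name SETJMP_WORKAROUND
and the helper functions are hypothetical, and the Windows branch
assumes that the MinGW-W64 <setjmp.h> declares _setjmpex(jmp_buf, void *)
and mingw_getsp(). On other platforms the macro falls back to the
standard setjmp, so the example still compiles and runs.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical wrapper macro (not from the attached patch). */
#if defined(_WIN64) && defined(__MINGW32__)
/* Assumption: <setjmp.h> from MinGW-W64 declares both _setjmpex and
   mingw_getsp. The second argument is set to the stack pointer, as
   MinGW-W64 reportedly did for early GCC versions. */
# define SETJMP_WORKAROUND(buf) _setjmpex((buf), mingw_getsp())
#else
/* Elsewhere, use the standard one-argument setjmp. */
# define SETJMP_WORKAROUND(buf) setjmp(buf)
#endif

static jmp_buf env;

/* Recurse a few frames deep, then jump back to the setjmp site. */
static void fail_deep(int depth)
{
    assert(depth >= 0);
    if (depth == 0)
        longjmp(env, 1);
    fail_deep(depth - 1);
}

/* Returns 1 if control came back through longjmp, 0 otherwise. */
int run_with_jump(int should_fail)
{
    if (SETJMP_WORKAROUND(env) != 0)
        return 1;           /* reached via longjmp */
    if (should_fail)
        fail_deep(3);       /* does not return normally */
    return 0;               /* normal completion */
}
```

Calling run_with_jump(1) exercises the setjmp/longjmp path that the
crashes were observed on; whether the Windows branch behaves correctly
on a given toolchain is exactly what would need testing.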

Thanks
Tomas





-------------- next part --------------
A non-text attachment was scrubbed...
Name: setjmp.diff
Type: text/x-patch
Size: 570 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20200826/a4ff18ba/attachment.bin>

