[R-pkg-devel] Debian and Fedora clang segmentation faults
Ivan Krylov
|kry|ov @end|ng |rom d|@root@org
Mon May 27 22:37:18 CEST 2024
On Mon, 27 May 2024 13:29:56 -0500,
Stephen Meyers <srmeyers2 using wisc.edu> wrote:
> I'm updating the 'astrochron' R package, and I'm trying to resolve a
> new segmentation fault that arises only with the Debian and Fedora
> clang compilers. An example is the function 'asm', which has been a
> component of astrochron since its debut July 2014:
>
> https://cran.r-project.org/web/checks/check_results_astrochron.html
This one is reproducible using containers or a virtual machine. Indeed,
the code crashes at the very beginning of the asm18_R subroutine:
> asm(freq=freq,target=target,fper=fper,rayleigh=rayleigh,nyquist=nyquist,sedmin=0.5,sedmax=3,
+ numsed=100,linLog=1,iter=100000,output=FALSE)
----- PERFORMING AVERAGE SPECTRAL MISFIT ANALYSIS -----
Program received signal SIGSEGV, Segmentation fault.
0x00007ff407f36774 in asm18_r_ ()
(gdb) disas
Dump of assembler code for function asm18_r_:
0x00007ff407f36760 <+0>: push %rbp
0x00007ff407f36761 <+1>: mov %rsp,%rbp
0x00007ff407f36764 <+4>: push %r15
0x00007ff407f36766 <+6>: push %r14
0x00007ff407f36768 <+8>: push %r13
0x00007ff407f3676a <+10>: push %r12
0x00007ff407f3676c <+12>: push %rbx
0x00007ff407f3676d <+13>: sub $0x17e42a78,%rsp
=> 0x00007ff407f36774 <+20>: mov %r9,-0x17e42838(%rbp)
0x00007ff407f3677b <+27>: mov %r8,-0x17e42830(%rbp)
0x00007ff407f36782 <+34>: mov %rcx,-0x17e42828(%rbp)
0x00007ff407f36789 <+41>: mov %rdx,-0x17e42820(%rbp)
0x00007ff407f36790 <+48>: mov %rsi,-0x17e42818(%rbp)
0x00007ff407f36797 <+55>: mov %rdi,-0x17e42810(%rbp)
flang-new-18 decided to subtract 400 megabytes from the stack pointer
right from the start, and never mind the fact that operating systems
treat the stack space like hundred-year-old brandy and the total stack
size limit is only 8 megabytes or so.
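For scale, here is a minimal Python sketch (not part of astrochron; Unix-only, since it relies on the standard `resource` module) that decodes the immediate from the `sub $0x17e42a78,%rsp` instruction above and compares it with the soft stack limit of the current process. The helper name `fits_on_stack` is made up for illustration.

```python
import resource

# Immediate operand of `sub $0x17e42a78,%rsp` in the disassembly above.
FRAME_BYTES = 0x17E42A78


def fits_on_stack(frame_bytes, soft_limit):
    """Return True if a frame of frame_bytes fits within soft_limit bytes.

    An unlimited stack (RLIM_INFINITY) is treated as always large enough.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return frame_bytes <= soft_limit


if __name__ == "__main__":
    # Soft limit is what `ulimit -s` reports (there in kilobytes, here in bytes).
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_STACK)
    limit_text = (
        "unlimited" if soft_limit == resource.RLIM_INFINITY else f"{soft_limit} bytes"
    )
    print(f"frame requested by asm18_r_: {FRAME_BYTES} bytes")
    print(f"soft stack limit:            {limit_text}")
    print(f"fits on the stack:           {fits_on_stack(FRAME_BYTES, soft_limit)}")
```

With the common 8-megabyte default the last line should report False, which matches the crash on the first store after the stack pointer is moved.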
I think that the 400.8 megabytes come from the saveAsm(mxsr,mxdata)
array, which is mxsr=500 * mxdata=100000 * 8 bytes per real(8) in size,
and store(mxdata), which takes an additional 800 kilobytes. When
compiling with warnings enabled, GFortran even produces a message about it:
>> Warning: Array ‘saveasm’ at (1) is larger than limit set by
>> ‘-fmax-stack-var-size=’, moved from stack to static storage. This
>> makes the procedure unsafe when called recursively, or concurrently
>> from multiple threads. Consider increasing the
>> ‘-fmax-stack-var-size=’ limit (or use ‘-frecursive’, which implies
>> unlimited ‘-fmax-stack-var-size’) - or change the code to use an
>> ALLOCATABLE array. If the variable is never accessed concurrently,
>> this warning can be ignored, and the variable could also be declared
>> with the SAVE attribute. [-Wsurprising]
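The size estimate can be checked against the frame size in the disassembly. A small Python sketch, assuming (as I do above) that saveAsm and store are the only large local arrays; the constant names are mine, not the package's:

```python
# Dimensions quoted above for the two large local arrays in asm18_R.
MXSR = 500
MXDATA = 100_000
REAL8_BYTES = 8  # one real(8) element

# saveAsm(mxsr,mxdata) and store(mxdata)
SAVE_ASM_BYTES = MXSR * MXDATA * REAL8_BYTES
STORE_BYTES = MXDATA * REAL8_BYTES
ARRAY_BYTES = SAVE_ASM_BYTES + STORE_BYTES

# Immediate operand of `sub $0x17e42a78,%rsp` in the disassembly.
FRAME_BYTES = 0x17E42A78

# Whatever the two arrays do not account for (presumably smaller locals
# and compiler temporaries; that part is a guess).
REMAINDER_BYTES = FRAME_BYTES - ARRAY_BYTES

if __name__ == "__main__":
    print(f"saveAsm:   {SAVE_ASM_BYTES} bytes")
    print(f"store:     {STORE_BYTES} bytes")
    print(f"frame:     {FRAME_BYTES} bytes")
    print(f"remainder: {REMAINDER_BYTES} bytes")
```

The two arrays account for all but about 29 kilobytes of the frame, so moving them off the stack should be enough.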
The Fortran standard is silent about the stack vs heap vs static
storage issue, so flang-new is technically allowed to try to fit 400
megabytes of temporary storage on the stack [*].
Since asm18_R doesn't appear to need to be reentrant, the fix is to
give the SAVE attribute to the two large variables, which makes
Fortran processors prefer a different memory location for them:
implicit real(8) (a-h,o-z)
save saveAsm, store
(Untested because I accidentally deleted the container while preparing
the message.)
--
Best regards,
Ivan
[*] https://stat.ethz.ch/pipermail/r-package-devel/2023q4/010237.html