[Rd] [External] Re: Operations with long altrep vectors cause segfaults on Windows

luke-tierney using uiowa.edu
Tue Sep 8 16:32:25 CEST 2020


On Tue, 8 Sep 2020, Hugh Parsonage wrote:

> Thanks Martin.  On further testing, it seems that the segmentation
> fault can only occur when the amount of obtainable memory is
> sufficiently high. On my machine (admittedly with other processes
> running):
>
> $ R --vanilla --max-mem-size=30G -e "x <- c(0L, -2e9:2e9)"
> Segmentation fault
>
> $ R --vanilla --max-mem-size=29G -e "x <- c(0L, -2e9:2e9)"
> Error: cannot allocate vector of size 14.9 Gb
> Execution halted
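
The 14.9 Gb in the error message matches the size of the fully materialized integer vector. A rough check in R, assuming 4 bytes per integer element and ignoring the vector header (the helper name `vec_gib` is made up for this illustration):

```r
# Size in GiB of a materialized vector, header ignored.
# vec_gib is a hypothetical helper, not part of R.
vec_gib <- function(n_elements, bytes_per_element) {
  n_elements * bytes_per_element / 2^30
}

# c(0L, -2e9:2e9): the sequence has 4e9 + 1 elements, plus the leading 0L.
n_int <- (2e9 - (-2e9) + 1) + 1

# Integer vectors use 4 bytes per element.
size_int <- vec_gib(n_int, 4)
cat(sprintf("%.1f Gb\n", size_int))  # prints "14.9 Gb"
```

So the request that either fails cleanly or segfaults is for roughly 16e9 bytes in a single allocation.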

Unfortunately I don't have access to a Windows machine with enough
memory to get to the point of failure. If you have Rtools and gdb
installed, can you run R under gdb and see where the segfault is happening?

Best,

luke

>
> On Tue, 8 Sep 2020 at 18:52, Martin Maechler <maechler using stat.math.ethz.ch> wrote:
>>
>>>>>>> Martin Maechler
>>>>>>>     on Tue, 8 Sep 2020 10:40:24 +0200 writes:
>>
>>>>>>> Hugh Parsonage
>>>>>>>     on Tue, 8 Sep 2020 18:08:11 +1000 writes:
>>
>>    >> I can only reproduce on Windows, but reliably (both 4.0.0 and 4.0.2):
>>
>>    >> $> R --vanilla
>>    >> x <- c(0L, -2e9:2e9)
>>
>>    >> # > Segmentation fault
>>
>>    >> Tried to reproduce on Linux but the above worked as expected. Not an
>>    >> issue merely with the length of the vector; for example, x <-
>>    >> rep_len(1:10, 1e10) works, though the altrep vector must be long to
>>    >> reproduce:
>>
>>    >> x <- c(0L, -1e9:1e9)  #ok
>>
>>    >> Segmentation faults occur with the following too:
>>
>>    >> x <- (-2e9:2e9) + 1L
>>
>>    > Your operation would "need" (not in theory, but in practice)
>>    > to go from altrep to regular vectors.
>>    > I guess the segfault occurs because of something like this :
>>
>>    > R asks Windows to hand it a huge amount of memory and Windows replies
>>    > "ok, here is the memory pointer"
>>    > and then R tries to write to there, but illegally (because
>>    > Windows should have told R that it does not really have enough
>>    > memory for that ..).
>>
>>    > I cannot reproduce the segmentation fault .. but I can confirm
>>    > there is a bug there that shows for me on Windows but not on
>>    > Linux:
>>
>>    > "My" Windows is on a terminal server without too many GB of memory
>>    > (but then in a version of Windows that recognizes that it cannot
>>    > get so much memory):
>>
>>    > ------------------------- Here some transcript (thanks to
>>    > using Emacs w/ ESS also on Windows) ------------------
>>
>>    > R Under development (unstable) (2020-08-24 r79074) -- "Unsuffered Consequences"
>>    > Copyright (C) 2020 The R Foundation for Statistical Computing
>>    > Platform: x86_64-w64-mingw32/x64 (64-bit)
>>
>>    > R is free software and comes with ABSOLUTELY NO WARRANTY.
>>    > You are welcome to redistribute it under certain conditions.
>>    > Type 'license()' or 'licence()' for distribution details.
>>
>>    > R is a collaborative project with many contributors.
>>    > Type 'contributors()' for more information and
>>    > 'citation()' on how to cite R or R packages in publications.
>>
>>    > Type 'demo()' for some demos, 'help()' for on-line help, or
>>    > 'help.start()' for an HTML browser interface to help.
>>    > Type 'q()' to quit R.
>>
>>    >> x <- (-2e9:2e9) + 1L
>>    > Fehler: kann Vektor der Größe 14.9 GB nicht allozieren
>>    >> y <- c(0L, -2e9:2e9)
>>    > Fehler: kann Vektor der Größe 14.9 GB nicht allozieren
>>    >> Sys.setenv(LANGUAGE="en")
>>    >> y <- c(0L, -2e9:2e9)
>>    > Error: cannot allocate vector of size 14.9 Gb
>>    >> y <- -1e9:4e9
>>    >> .Internal(inspect(y))
>>    > @0x00000000195a6808 14 REALSXP g0c0 [REF(65535)]  -1000000000 : -294967296 (compact)
>>    >> .Machine$integer.max / 1e9
>>    > [1] 2.147484
>>    >> y <- -1e6:2.2e9
>>    >> .Internal(inspect(y))
>>    > @0x000000000a11a5d8 14 REALSXP g0c0 [REF(65535)]  -1000000 : -2094967296 (compact)
>>    >> y <- -1e6:2e9
>>    >> .Internal(inspect(y))
>>    > @0x000000000a13adf0 13 INTSXP g0c0 [REF(65535)]  -1000000 : 2000000000 (compact)
>>    >>
>>    > ------------------------- end of transcript -----------------------------------
>>
>>    > So indeed, no seg.fault, R notices that it can't get 15 GB of
>>    > memory.
>>
>>    > But the bug is bad news:  We have *silent* integer overflow happening
>>    > according to what  .Internal(inspect(y)) shows...
>>
>>    > .... less bad news: Probably the bug is only in the 'internal inspect' code
>>    > where a format specifier is used in C's printf() that does not work
>>    > correctly on Windows, at least the way it is currently compiled ..
>>
>>
>>    > On (64-bit) Linux, I get
>>
>>    >> y <- -1e9:4e9 ; .Internal(inspect(y))
>>    > @7d86388 14 REALSXP g0c0 [REF(65535)]  -1000000000 : 4000000000 (compact)
>>
>>    >> y <- c(0L, y)
>>    > Error: cannot allocate vector of size 37.3 Gb
>>
>>    > which seems much better ... until I do find a bug, maybe again
>>    > only in the C code underlying .Internal(inspect(.)) :
>>
>>    >> y <- -1e9:2e9 ; .Internal(inspect(y))
>>    > @7d86ac0 13 INTSXP g0c0 [REF(65535)] Error: long vectors not supported yet: ../../../R/src/main/altclasses.c:139
>>    >>
>>
>> Indeed, the purported "integer overflow" (above) does not
>> happen.
>> It is "only" a  'printf' related bug inside .Internal(inspect(.)) on Windows.
>>
>> *interestingly*, the above bug I've noticed on (64-bit) Linux
>> does *not* show on Windows (64-bit), at least not for that case:
>>
>> On Windows, things are fine as long as they remain (compacted
>> aka 'ALTREP') INTSXP:
>>
>>  > y <- -1e3:2e9 ;.Internal(inspect(y))
>>   @0x000000000a285648 13 INTSXP g0c0 [REF(65535)]  -1000 : 2000000000 (compact)
>>  > y <- -1e3:2.1e9 ;.Internal(inspect(y))
>>   @0x0000000019925930 13 INTSXP g0c0 [REF(65535)]  -1000 : 2100000000 (compact)
>>
>> and here, y is correct, just the printing from
>> .Internal(inspect(y)) is buggy (probably prints the double as an integer):
>>
>>  > y <- -1e3:2.2e9 ; .Internal(inspect(y))
>>   @0x00000000195c0178 14 REALSXP g0c0 [REF(65535)]  -1000 : -2094967296 (compact)
>>  > length(y)
>>   [1] 2200001001
>>  > tail(y)
>>   [1] 2.2e+09 2.2e+09 2.2e+09 2.2e+09 2.2e+09 2.2e+09
>>  > tail(y) - 2.2e9
>>   [1] -5 -4 -3 -2 -1  0
>>  >
>>
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
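
The odd upper bounds in the Windows inspect output quoted above (-294967296 for 4e9, -2094967296 for 2.2e9) are what one would get if the double endpoint were narrowed to a 32-bit signed integer before printing. A minimal R sketch of that wraparound, assuming two's-complement narrowing; `wrap_int32` is a made-up helper for illustration and is not taken from the R sources:

```r
# Emulate narrowing a whole-number double to a 32-bit signed integer.
# wrap_int32 is a hypothetical helper, not R's actual inspect code.
wrap_int32 <- function(x) {
  r <- x %% 2^32                   # reduce modulo 2^32 into [0, 2^32)
  ifelse(r >= 2^31, r - 2^32, r)   # upper half maps to negative values
}

# Endpoints from the transcript:
# 4e9 -> -294967296, 2.2e9 -> -2094967296, 2e9 -> 2000000000 (unchanged)
wrapped <- wrap_int32(c(4e9, 2.2e9, 2e9))
print(format(wrapped, scientific = FALSE))

# Only endpoints above .Machine$integer.max force a REALSXP sequence,
# which is where the misprinted values show up.
exceeds_int <- c(2.1e9, 2.2e9) > .Machine$integer.max
print(exceeds_int)                 # FALSE TRUE
```

This only reproduces the printed numbers; it does not show which format specifier or cast is responsible.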

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

