[Rd] Operations with long altrep vectors cause segfaults on Windows
Martin Maechler
m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Tue Sep 8 10:40:24 CEST 2020
>>>>> Hugh Parsonage
>>>>> on Tue, 8 Sep 2020 18:08:11 +1000 writes:
> I can only reproduce on Windows, but reliably (both 4.0.0 and 4.0.2):
> $> R --vanilla
> x <- c(0L, -2e9:2e9)
> # > Segmentation fault
> Tried to reproduce on Linux but the above worked as expected. Not an
> issue merely with the length of the vector; for example, x <-
> rep_len(1:10, 1e10) works, though the altrep vector must be long to
> reproduce:
> x <- c(0L, -1e9:1e9) #ok
> Segmentation faults occur with the following too:
> x <- (-2e9:2e9) + 1L
Your operations "need" (not in theory, but in practice) to expand the
compact altrep sequence into a regular, fully allocated vector.
I guess the segfault occurs because of something like this:
R asks Windows for a huge amount of memory, Windows replies
"ok, here is the memory pointer", and then R tries to write there,
but illegally (because Windows should have told R that it does not
really have enough memory for that ..).
I cannot reproduce the segmentation fault .. but I can confirm
that there is a bug which shows for me on Windows but not on
Linux.
"My" Windows runs on a terminal server without too many GB of memory
(but in a version of Windows that does recognize that it cannot
get that much memory):
------------------------- Here some transcript (thanks to using Emacs w/ ESS also on Windows) ------------------
R Under development (unstable) (2020-08-24 r79074) -- "Unsuffered Consequences"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
R ist freie Software und kommt OHNE JEGLICHE GARANTIE.
Sie sind eingeladen, es unter bestimmten Bedingungen weiter zu verbreiten.
Tippen Sie 'license()' or 'licence()' für Details dazu.
R ist ein Gemeinschaftsprojekt mit vielen Beitragenden.
Tippen Sie 'contributors()' für mehr Information und 'citation()',
um zu erfahren, wie R oder R packages in Publikationen zitiert werden können.
Tippen Sie 'demo()' für einige Demos, 'help()' für on-line Hilfe, oder
'help.start()' für eine HTML Browserschnittstelle zur Hilfe.
Tippen Sie 'q()', um R zu verlassen.
> x <- (-2e9:2e9) + 1L
Fehler: kann Vektor der Größe 14.9 GB nicht allozieren
> y <- c(0L, -2e9:2e9)
Fehler: kann Vektor der Größe 14.9 GB nicht allozieren
> Sys.setenv(LANGUAGE="en")
> y <- c(0L, -2e9:2e9)
Error: cannot allocate vector of size 14.9 Gb
> y <- -1e9:4e9
> .Internal(inspect(y))
@0x00000000195a6808 14 REALSXP g0c0 [REF(65535)] -1000000000 : -294967296 (compact)
> .Machine$integer.max / 1e9
[1] 2.147484
> y <- -1e6:2.2e9
> .Internal(inspect(y))
@0x000000000a11a5d8 14 REALSXP g0c0 [REF(65535)] -1000000 : -2094967296 (compact)
> y <- -1e6:2e9
> .Internal(inspect(y))
@0x000000000a13adf0 13 INTSXP g0c0 [REF(65535)] -1000000 : 2000000000 (compact)
>
------------------------- end of transcript -----------------------------------
So indeed, no segfault: R notices that it cannot get 15 GB of
memory.
But the bug is bad news: we have *silent* integer overflow happening
according to what .Internal(inspect(y)) shows...
.... less bad news: probably the bug is only in the 'internal inspect' code,
where a format specifier is used in C's printf() that does not work
correctly on Windows, at least the way it is currently compiled ..
On (64-bit) Linux, I get
> y <- -1e9:4e9 ; .Internal(inspect(y))
@7d86388 14 REALSXP g0c0 [REF(65535)] -1000000000 : 4000000000 (compact)
> y <- c(0L, y)
Error: cannot allocate vector of size 37.3 Gb
which seems much better ... until I do find a bug, maybe again
only in the C code underlying .Internal(inspect(.)) :
> y <- -1e9:2e9 ; .Internal(inspect(y))
@7d86ac0 13 INTSXP g0c0 [REF(65535)] Error: long vectors not supported yet: ../../../R/src/main/altclasses.c:139
>