[Rd] bug in sum() on integer vector

John C Nash nashjc at uottawa.ca
Wed Dec 14 16:19:58 CET 2011


Following this thread, I wondered why nobody tried cumsum to see where the integer
overflow occurs. On the shorter xx vector in the little script below I get a message:

Warning message:
Integer overflow in 'cumsum'; use 'cumsum(as.numeric(.))'
>

But sum() does not give such a warning, which I believe is the point of contention. Since
cumsum() does manage to give such a warning, and show where the overflow occurs, should
sum() not be able to do so? For the record, I don't class the non-zero answer as an error
in itself. I regard the failure to warn as the issue.
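
A minimal sketch of the kind of check being asked for, using only base R. The helper names `first_overflow` and `checked_sum` are hypothetical (they are not existing R functions), and this is not a description of how sum() or cumsum() work internally. It simply accumulates in double precision and warns when a partial sum leaves the integer range.

```r
# Shorter test vector from the script in this post; its exact sum is 0.
xx <- c(rep(1800000003L, 1000), -rep(1200000002L, 1500))

# Hypothetical helper: index of the first partial sum outside the
# integer range, or NA if every partial sum fits.
first_overflow <- function(v) {
  running <- cumsum(as.numeric(v))   # partial sums held as doubles
  which(abs(running) > .Machine$integer.max)[1]
}

# Hypothetical helper: sum in double precision, but warn (as cumsum() does)
# when an intermediate result would not fit in an integer.
checked_sum <- function(v) {
  pos <- first_overflow(v)
  if (!is.na(pos)) {
    warning("partial sum leaves the integer range at element ", pos)
  }
  sum(as.numeric(v))
}

first_overflow(xx)   # 2: the second 1800000003 already exceeds 2147483647
checked_sum(xx)      # returns 0, with a warning
```

For xx the partial sums stay far below 2^53, so the double accumulation is exact here; that would not hold for arbitrarily long vectors.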

For info, on my Ubuntu Lucid 10.04 system, which has 4 GB of RAM but no swap, the last line
of the script to do the int64 sum chugs for about 2 minutes then gives "Killed" and
returns to the terminal prompt. It also seems to render some other applications unstable
(I had Thunderbird running to read R-devel, and this started to behave strangely after the
crash, and I had to reboot.) I'm copying Romain as package maintainer, and I'll be happy
to try to work off-list to figure out how to avoid the "Killed" result. (On a 16GB
machine, I got the 0 answer.)

Best,

John Nash

Here's the system info and small script.

>> sessionInfo()
> R version 2.14.0 (2011-10-31)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>  [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8
>  [7] LC_PAPER=C                LC_NAME=C
>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] int64_1.1.2
>>


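## sumerr.R  20111214
library(int64)
## Long vector: the exact sum is 0, but partial sums exceed the integer range.
x <- c(rep(1800000003L, 10000000), -rep(1200000002L, 15000000))
## Shorter vector with the same property.
xx <- c(rep(1800000003L, 1000), -rep(1200000002L, 1500))
sum(x)                 # integer sum: no warning is given
sum(as.double(x))
sum(xx)
sum(as.double(xx))
cumsum(xx)             # gives the integer overflow warning quoted above
cumsum(as.int64(xx))

tmp <- readline("Now try the VERY SLOW int64")
sum(as.int64(x))       # "Killed" after about 2 minutes on the 4 GB machine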
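
For scale, the partial sums of the long vector x can be worked out without allocating it. The sketch below is plain double arithmetic on the constants from the script; the variable names are illustrative. The comparison with 2^53 only matters if the accumulation is carried out in double precision, which is an assumption here and not something established in this post.

```r
n_pos <- 1e7     # number of 1800000003 entries in x
n_neg <- 1.5e7   # number of -1200000002 entries in x

peak  <- 1800000003 * n_pos          # largest partial sum, after the positive run
total <- peak - 1200000002 * n_neg   # exact sum of x

peak > .Machine$integer.max   # TRUE: far outside the integer range
peak > 2^53                   # TRUE: beyond where doubles hold every integer exactly
total == 0                    # TRUE: both products are exactly representable
```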


