[Rd] bug in sum() on integer vector

Fri Dec 9 20:39:57 CET 2011

On 09/12/2011 1:40 PM, Hervé Pagès wrote:
> Hi,
>
>     x <- c(rep(1800000003L, 10000000), -rep(1200000002L, 15000000))
>
> This is correct:
>
>     >  sum(as.double(x))
>     [1] 0
>
> This is not:
>
>     >  sum(x)
>     [1] 4996000
>
> Returning NA (with a warning) would also be acceptable for the latter.
> That would make it consistent with cumsum(x):
>
>     >  cumsum(x)[length(x)]
>     [1] NA
>     Warning message:
>     Integer overflow in 'cumsum'; use 'cumsum(as.numeric(.))'

This is a 64-bit problem; on 32-bit builds the result is correct. My
guess is that in 64-bit arithmetic either we or the run-time are doing
something to simulate 32-bit arithmetic (since integers are 32 bits),
and that we're not quite getting it right.
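
As a rough illustration of why an intermediate result can go wrong even
though the final total (0) is tiny, the sketch below looks only at the
size of the running total after the positive block of x. It assumes,
without confirmation from the sources, that sum() keeps its integer
running total in a 64-bit double; the variable names are made up for
the example.

```r
# Sketch only; assumes (unconfirmed) a 64-bit double accumulator in sum().

first <- 1800000003   # value of each of the first 10,000,000 elements
n1    <- 1e7          # number of such elements

# Running total after the positive block. Both factors and their product
# are exactly representable as doubles, so this value is exact.
partial <- first * n1

# The running total is far outside the 32-bit integer range ...
exceeds_int <- partial > .Machine$integer.max

# ... and also beyond 2^53, the point past which a double can no longer
# represent every integer.
exceeds_2_53 <- partial > 2^53

# Past 2^53, adding 1 to a double gives a result that cannot be stored
# and is rounded back to the same value.
rounded <- (2^53 + 1) == 2^53

print(c(exceeds_int, exceeds_2_53, rounded))
# [1] TRUE TRUE TRUE
```

If that assumption holds, every addition of an odd element beyond 2^53
could be off by one, which would be enough to produce a small nonzero
total instead of 0 or NA.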

Duncan Murdoch

> Thanks!
> H.
>
>   >  sessionInfo()
> R version 2.14.0 (2011-10-31)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
>  [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
>  [7] LC_PAPER=C                 LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>