[R] Best way to compute a sum
David Winsemius
dwinsemius at comcast.net
Thu Jun 24 22:49:35 CEST 2010
On Jun 24, 2010, at 4:08 PM, Lasse Kliemann wrote:
>> a <- 0 ; for(i in (1:200000000)) a <- a + 1/i
>> b <- 0 ; for(i in (200000000:1)) b <- b + 1/i
>> c <- sum(1/(1:200000000))
>> d <- sum(1/(200000000:1))
>> order(c(a,b,c,d))
> [1] 1 2 4 3
>> b<c
> [1] TRUE
>> c==d
> [1] FALSE
>
> I'd expected b being the largest, since we sum up the smallest
> numbers first.
Can you explain? I would have assumed the floating point algorithms
would incorporate some sort of sensible rounding strategy that would
prevent consistent underflow.
> Instead, c is the largest, which is sum() applied
> to the vector ordered with largest numbers first.
>
> Can anyone shed some light on this?
> sum(1/(200000000:1))
[1] 19.69104
> sum(1/(1:200000000))
[1] 19.69104
> b <- 0 ; for(i in (200000000:1)) b <- b + 1/i
> b
[1] 19.69104
> sum(1/(1:200000000)) - sum(1/(200000000:1))
[1] 7.105427e-15
> sum(1/(200000000:1)) -b
[1] 1.673328e-12
> sum(1/(1:200000000)) -b
[1] 1.680434e-12
> .Machine$double.eps
[1] 2.220446e-16
> .Machine$double.eps ^ 0.5
[1] 1.490116e-08
>
> What is the best way in R to compute a sum while avoiding
> cancellation effects?
>
> By the way, sum() in the above example is much faster than the
> loops, so it would be nice if we could utilize it.
Why wouldn't you? Generally differences as small as you are observing
are either meaningless or unavoidable using double precision
arithmetic. If you need exact arithmetic, then pick an appropriate
platform.
--
David.
More information about the R-help
mailing list