[R] Best way to compute a sum
Duncan Murdoch
murdoch.duncan at gmail.com
Thu Jun 24 22:26:51 CEST 2010
On 24/06/2010 4:08 PM, Lasse Kliemann wrote:
> > a <- 0 ; for(i in (1:200000000)) a <- a + 1/i
> > b <- 0 ; for(i in (200000000:1)) b <- b + 1/i
> > c <- sum(1/(1:200000000))
> > d <- sum(1/(200000000:1))
> > order(c(a,b,c,d))
> [1] 1 2 4 3
> > b<c
> [1] TRUE
> > c==d
> [1] FALSE
>
> I'd expected b being the largest, since we sum up the smallest
> numbers first. Instead, c is the largest, which is sum() applied
> to the vector ordered with largest numbers first.
>
> Can anyone shed some light on this?
>
I don't know why you'd expect b to be larger than the others. When
you're using sum(), you should expect the most accurate answer, but it
won't necessarily be the biggest: some rounding errors will make the
answer bigger than it should be. For example, if we only had whole
number arithmetic, 0.6 + 0.6 + 0.6 might come out to 3, if each value
was rounded to 1 before adding.
As to whether c or d comes out better: you'd need to program it
yourself if you care about getting the last bit right.
> What is the best way in R to compute a sum while avoiding
> cancellation effects?
>
Use sum(). If it's not good enough, then do it in C, accumulating in
extended precision (which is what sum() does), from smallest to largest
if all terms are positive. If there are mixed signs it's a lot harder
to optimize, but I believe the optimal order would be something like the
one that keeps each partial sum as close as possible to zero. It would
rarely be practical to implement that, so summing from smallest absolute
value to largest would be my recommendation.
> By the way, sum() in the above example is much faster than the
> loops, so it would be nice if we could utilize it.
>
Why do you think you can't?
Duncan Murdoch
> ------------------------------------------------------------------------
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
More information about the R-help
mailing list