[Rd] Discrepancy: R sum() VS C or Fortran sum
Pierre Chausse
pchausse at uwaterloo.ca
Fri Mar 16 18:08:07 CET 2018
My simple functions were to compare the result with the gfortran
compiler sum() function. I thought that the Fortran sum could not be
less precise than R. I was wrong. I am impressed. The R sum does in fact
match the result if we use the Kahan algorithm.
P.
I am glad to see that R sum() is more accurate than the gfortran
compiler sum.
On 16/03/18 11:37 AM, luke-tierney at uiowa.edu wrote:
> Install the gmp package, run your code, and then try this:
>
> bu <- gmp::as.bigq(u)
> bs4 <- bu[1] + bu[2] + bu[3] + bu[4] + bu[5]
> s4 <- as.double(bs4)
> s1 - s4
> ## [1] 0
> s2[[2]] - s4
> ## [1] 7.105427e-15
> s3 - s4
> ## [1] 7.105427e-15
> identical(s1, s4)
> ## [1] TRUE
>
> `bs4` is the exact sum of the binary rationals in your `u` vector;
> `s4` is the closest double precision to this exact sum.
>
> Looking at the C source code for sum() will show you that it makes
> some extra efforts to get a more accurate sum than your simple
> version.
>
> Best,
>
> luke
>
> On Fri, 16 Mar 2018, Pierre Chausse wrote:
>
>> Hi all,
>>
>> I found a discrepancy between the sum() in R and either a sum done in
>> C or Fortran for vector of just 5 elements. The difference is very
>> small, but this is a very small part of a much larger numerical
>> problem in which first and second derivatives are computed
>> numerically. This is part of a numerical method course I am teaching
>> in which I want to compare speeds of R versus Fortran (We solve a
>> general equilibrium problem all numerically, if you want to know).
>> Because of this discrepancy, the Jacobian and Hessian in R versus in
>> Fortran are quite different, which results in the Newton method
>> producing a different solution (for a given stopping rule). Since the
>> solution computed in Fortran is almost identical to the analytical
>> solution, I suspect that the sum in Fortran may be more accurate
>> (That's just a guess). Most of the time the sum produces identical
>> results, but for some numbers, it is different. The following
>> example, shows what happens:
>>
>> set.seed(12233)
>> n <- 5
>> a <- runif(n,1,5)
>> e <- runif(n, 5*(1:n),10*(1:n))
>> s <- runif(1, 1.2, 4)
>> p <- runif(5, 3, 10)
>> x <- c(e[-5], (sum(e*p)-sum(e[-5]*p[-5]))/p[5])
>> u <- a^(1/s)*x^((s-1)/s)
>> dyn.load("sumF.so")
>>
>> u[1] <- u[1]+.0001 ### If we do not add .0001, all differences are 0
>> s1 <- sum(u)
>> s2 <- .Fortran("sumf", as.double(u), as.integer(n), sf1=double(1),
>> sf2=double(1))[3:4]
>> s3 <- .C("sumc", as.double(u), as.integer(n), sC=double(1))[[3]]
>>
>> s1-s2[[1]] ## R versus compiler sum() (Fortran)
>>
>> [1] -7.105427e-15
>>
>> s1-s2[[2]] ## R versus manual sum (Fortran
>>
>> [1] -7.105427e-15
>>
>> s1-s3 ## R Versus manual sum in C
>>
>> [1] -7.105427e-15
>>
>> s2[[2]]-s2[[1]] ## manual sum versus compiler sum() (Fortran)
>>
>> [1] 0
>>
>> s3-s2[[2]] ## Fortran versus C
>>
>> [1] 0
>>
>> My sumf and sumc are
>>
>> subroutine sumf(x, n, sx1, sx2)
>> integer i, n
>> double precision x(n), sx1, sx2
>> sx1 = sum(x)
>> sx2 = 0.0d0
>> do i=1,n
>> sx2 = sx2+x(i)
>> end do
>> end
>>
>> void sumc(double *x, int *n, double *sum)
>> {
>> int i;
>> double sum1 = 0.0;
>> for (i=0; i< *n; i++) {
>> sum1 += x[i];
>> }
>> *sum = sum1;
>> }
>>
>> Can that be a bug? Thanks.
>>
>>
>
--
Pierre Chaussé
Assistant Professor
Department of Economics
University of Waterloo
More information about the R-devel
mailing list