[Rd] Accuracy: Correct sums in rowSums(), colSums() (PR#6196)
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue Dec 30 09:30:21 MET 2003
Could you please prepare a patch against the current R-devel sources?
1.7.1 is several versions old.
I was puzzled as to why you are recommending improving the accuracy of the
`fast' methods and not the slow one, applying apply() to mean() and sum().
The improvement might be larger in those cases: what Goldberg calls
`a dramatic improvement' is from N eps down to 2 eps, and that is for the
worst case and is only dramatic if N is dramatically bigger than 2. It is
also not clear that it is effective on machines with extended-precision
registers unless N is very large.
Some of us are well aware of the issues of floating point arithmetic, but
R is normally used for statistical computations, where errors in the data
(rounding, measurement etc) normally are orders of magnitude bigger than
those in floating-point arithmetic. So implementation effort has been
spent on handling the statistical errors first.
On Tue, 30 Dec 2003 Nick.Efthymiou at schwab.com wrote:
> Full_Name: Nick Efthymiou
> Version: R1.5.0 and above
> OS: Red Hat Linux
> Submission from: (NULL) (162.93.14.73)
>
>
> With the introduction of the functions rowSums(), colSums(), rowMeans() and
> colMeans() in R1.5.0, function "SEXP do_colsum(SEXP call, SEXP op, SEXP args,
> SEXP rho)" was added to perform the fast summations. We have an excellent
> opportunity to improve the accuracy by implementing Kahan summation here.
>
> Kahan summation is described in
>
> @article{ goldberg91what,
> author = "David Goldberg",
> title = "What Every Computer Scientist Should Know About Floating-Point
> Arithmetic",
> journal = "ACM Computing Surveys",
> volume = "23",
> number = "1",
> pages = "5--48",
> year = "1991",
> url = "citeseer.nj.nec.com/goldberg91what.html" }
>
>
> (http://citeseer.nj.nec.com/goldberg91what.html)
>
> and in the article "Floating-point Summation", by Evan Manning, in C/C++ User's
> Journal, Sept. 1996.
>
> The proposed improvement has been tested in !IEEE_754 mode, the patch is
> attached.
> It is intended to be applied against R-1.7.1/src/main/array.c
>
> --------- Cut here ----------
> *** array.c.old Mon Dec 15 17:33:23 2003
> --- array.c Mon Dec 15 17:33:45 2003
> ***************
> *** 1016,1022 ****
> int OP, n, p, cnt = 0, i, j, type;
> Rboolean NaRm, keepNA;
> int *ix;
> ! double *rx, sum = 0.0;
>
> checkArity(op, args);
> x = CAR(args); args = CDR(args);
> --- 1016,1022 ----
> int OP, n, p, cnt = 0, i, j, type;
> Rboolean NaRm, keepNA;
> int *ix;
> ! double *rx, sum = 0.0, correction = 0.0;
>
> checkArity(op, args);
> x = CAR(args); args = CDR(args);
> ***************
> *** 1046,1062 ****
> switch (type) {
> case REALSXP:
> rx = REAL(x) + n*j;
> #ifdef IEEE_754
> if (keepNA)
> ! for (sum = 0., i = 0; i < n; i++) sum += *rx++;
> else {
> ! for (cnt = 0, sum = 0., i = 0; i < n; i++, rx++)
> ! if (!ISNAN(*rx)) {cnt++; sum += *rx;}
> else if (keepNA) {sum = NA_REAL; break;}
> }
> #else
> ! for (cnt = 0, sum = 0., i = 0; i < n; i++, rx++)
> ! if (!ISNAN(*rx)) {cnt++; sum += *rx;}
> else if (keepNA) {sum = NA_REAL; break;}
> #endif
> break;
> --- 1046,1082 ----
> switch (type) {
> case REALSXP:
> rx = REAL(x) + n*j;
> + sum = 0.;
> #ifdef IEEE_754
> if (keepNA)
> ! for (i = 0; i < n; i++) sum += *rx++;
> else {
> ! for (cnt = 0, i = 0; i < n; i++, rx++)
> ! if (!ISNAN(*rx)) {
> ! /* TODO: Kahan summation */
> ! cnt++; sum += *rx;
> ! }
> else if (keepNA) {sum = NA_REAL; break;}
> }
> #else
> ! i = cnt = 0;
> ! if (i < n) {
> ! if (ISNAN(*rx)) {
> ! if (keepNA) {sum = NA_REAL; break /* switch */;}
> ! } else {
> ! cnt++; sum = *rx;
> ! }
> ! i++; rx++;
> ! }
> ! for (; i < n; i++, rx++)
> ! if (!ISNAN(*rx)) {
> ! /* Kahan summation */
> ! double yhilo = *rx - correction;
> ! double tsum = sum + yhilo;
> ! correction = (tsum - sum) - yhilo;
> ! sum = tsum;
> ! cnt++;
> ! }
> else if (keepNA) {sum = NA_REAL; break;}
> #endif
> break;
> ***************
> *** 1121,1137 ****
> switch (type) {
> case REALSXP:
> rx = REAL(x) + i;
> #ifdef IEEE_754
> if (keepNA)
> ! for (sum = 0., j = 0; j < p; j++, rx += n) sum += *rx;
> else {
> ! for (cnt = 0, sum = 0., j = 0; j < p; j++, rx += n)
> ! if (!ISNAN(*rx)) {cnt++; sum += *rx;}
> else if (keepNA) {sum = NA_REAL; break;}
> }
> #else
> ! for (cnt = 0, sum = 0., j = 0; j < p; j++, rx += n)
> ! if (!ISNAN(*rx)) {cnt++; sum += *rx;}
> else if (keepNA) {sum = NA_REAL; break;}
> #endif
> break;
> --- 1141,1177 ----
> switch (type) {
> case REALSXP:
> rx = REAL(x) + i;
> + sum = 0.;
> #ifdef IEEE_754
> if (keepNA)
> ! for (j = 0; j < p; j++, rx += n) sum += *rx;
> else {
> ! for (cnt = 0, j = 0; j < p; j++, rx += n)
> ! if (!ISNAN(*rx)) {
> ! /* TODO: Kahan summation */
> ! cnt++; sum += *rx;
> ! }
> else if (keepNA) {sum = NA_REAL; break;}
> }
> #else
> ! j = cnt = 0;
> ! if (j < p) {
> ! if (ISNAN(*rx)) {
> ! if (keepNA) {sum = NA_REAL; break /* switch */;}
> ! } else {
> ! cnt++; sum = *rx;
> ! }
> ! j++; rx += n;
> ! }
> ! for (; j < p; j++, rx += n)
> ! if (!ISNAN(*rx)) {
> ! /* Kahan summation */
> ! double yhilo = *rx - correction;
> ! double tsum = sum + yhilo;
> ! correction = (tsum - sum) - yhilo;
> ! sum = tsum;
> ! cnt++;
> ! }
> else if (keepNA) {sum = NA_REAL; break;}
> #endif
> break;
> ---------end of patch -------
>
>
> Since someone is certainly wondering why I bothered to bring this up - I was
> trying to optimize the order of a Chebyshev approximation to a specific function
> so that I accomplish the most power-efficient calculation at one ulp accuracy
> (multiplications and additions consume battery power). The optimization was
> based on minimizing the relative error between the true function and the
> approximation. It involved a lot of sums and was not converging properly. With
> the patch, I got the convergence I needed.
>
> Thanks,
>
> - Nick -
>
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-devel
mailing list