[Rd] What algorithm is R using to calculate mean?

Ravi Varadhan ravi.varadhan at jhu.edu
Fri Jul 26 14:58:28 CEST 2013


This uses the idea of Kahan's summation, if I am not mistaken.

http://en.wikipedia.org/wiki/Kahan_summation_algorithm 

Ravi
________________________________________
From: r-devel-bounces at r-project.org [r-devel-bounces at r-project.org] on behalf of Joshua Ulrich [josh.m.ulrich at gmail.com]
Sent: Friday, July 26, 2013 7:12 AM
To: Zach Harrington
Cc: r-devel at r-project.org List
Subject: Re: [Rd] What algorithm is R using to calculate mean?

This was also asked on StackOverflow:
http://stackoverflow.com/q/17866149/271616.  Here is the answer I
posted:

This appears to be the updating method of West, 1979 [1] and it was
implemented in R-2.3.0 in response to PR#1228 [2].

I'm not positive this is the correct algorithm, since it was suggested
by Martin Maechler but implemented by Brian Ripley. I couldn't find a
reference in the source code or version control logs that listed the
actual algorithm used. It was implemented in cov.c in revision 37389
and in summary.c in revision 37393.

[1] http://dl.acm.org/citation.cfm?doid=359146.359153
[2] https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=1228

Best,
--
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com


On Thu, Jul 25, 2013 at 2:44 PM, Zach Harrington
<zach.harrington at gmail.com> wrote:
> I am curious to know what algorithm R's mean function uses. Is there some
> reference to the numerical properties of this algorithm?
>
> I found the following C code in summary.c:do_summary():
> case REALSXP:
>     PROTECT(ans = allocVector(REALSXP, 1));
>     for (i = 0; i < n; i++) s += REAL(x)[i];
>     s /= n;
>     if(R_FINITE((double)s)) {
>         for (i = 0; i < n; i++) t += (REAL(x)[i] - s);
>         s += t/n;
>     }
>     REAL(ans)[0] = s;
>     break;
>
> It seems to do a straight up mean:
> for (i = 0; i < n; i++) s += REAL(x)[i];
> s /= n;
>
> Then it adds what i assume is a numerical correction which seems to be the
> mean difference from the mean of the data:
> for (i = 0; i < n; i++) t += (REAL(x)[i] - s);
> s += t/n;
>
> I haven't been able to track this algorithm down anywhere (mean is not a
> great search term).
>
> Any help would be much appreciated,
>
> Zach Harrington
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



More information about the R-devel mailing list