[Rd] mean(x) != mean(rev(x)) different with x <- c(NA, NaN) for some builds

Henrik Bengtsson henrik.bengtsson at gmail.com
Sat Apr 1 08:52:19 CEST 2017


Although help("is.nan") says:

   "Computations involving NaN will return NaN or perhaps NA: ..."

it might not be obvious that this is also why one may get:

> mean(c(-Inf, +Inf, NA))
[1] NaN

> mean(c(-Inf, NA, +Inf))
[1] NA

This is because internally the intermediate sum +Inf + -Inf is NaN in
the first case.

May I propose the following patch to that help paragraph:

Index: src/library/base/man/is.finite.Rd
===================================================================
--- src/library/base/man/is.finite.Rd (revision 72462)
+++ src/library/base/man/is.finite.Rd (working copy)
@@ -78,6 +78,8 @@
   Computations involving \code{NaN} will return \code{NaN} or perhaps
   \code{\link{NA}}: which of those two is not guaranteed and may depend
   on the \R platform (since compilers may re-order computations).
+  This may also apply to computations involving both \code{-Inf} and
+  \code{+Inf} in cases where they produce an intermediate \code{NaN}.
 }
 \value{
   A logical vector of the same length as \code{x}: \code{dim},

/Henrik


On Fri, Mar 31, 2017 at 10:51 PM, Henrik Bengtsson
<henrik.bengtsson at gmail.com> wrote:
> On Fri, Mar 31, 2017 at 10:14 PM, Prof Brian Ripley
> <ripley at stats.ox.ac.uk> wrote:
>> From ?NA
>>
>>      Numerical computations using ‘NA’ will normally result in ‘NA’: a
>>      possible exception is where ‘NaN’ is also involved, in which case
>>      either might result.
>>
>> and ?NaN
>>
>>      Computations involving ‘NaN’ will return ‘NaN’ or perhaps ‘NA’:
>>      which of those two is not guaranteed and may depend on the R
>>      platform (since compilers may re-order computations).
>>
>> fortunes::fortune(14) applies (yet again).
>
> Thanks; I'm often happy to have contributed to some of the fortune
> counters, but not so sure about this one.   What's even worse is that
> one of my own matrixStats NEWS has an entry go a few years back which
> mentions "... incorrectly assumed that the value of prod(c(NaN, NA))
> is uniquely defined.  However, as documented in help("is.nan"), it may
> be NA or NaN depending on R system/platform."  I guess the joke is on
> me - it's April 1st after all.
>
> But, technically one could test for ISNA(x) for each element before
> calculating the intermediate sum, but since that is a quite expensive
> test it is not done and sum += x is performed "as is" on NA and NaN
> (and -Inf and +Inf).  Is that correct?
>
> /Henrik
>
>>
>>
>> On 01/04/2017 04:50, Henrik Bengtsson wrote:
>>>
>>> In R 3.3.3, I observe the following on Ubuntu 16.04 (when building
>>> from source as well as for the sudo apt r-base build):
>>>
>>>> x <- c(NA, NaN)
>>>> mean(x)
>>>
>>> [1] NA
>>>>
>>>> mean(rev(x))
>>>
>>> [1] NaN
>>>
>>>> rowMeans(matrix(x, nrow = 1, ncol = 2))
>>>
>>> [1] NA
>>>>
>>>> rowMeans(matrix(rev(x), nrow = 1, ncol = 2))
>>>
>>> [1] NaN
>>>
>>>> .rowMeans(x, m = 1, n = 2)
>>>
>>> [1] NA
>>>>
>>>> .rowMeans(rev(x), m = 1, n = 2)
>>>
>>> [1] NaN
>>>
>>>> .rowSums(x, m = 1, n = 2)
>>>
>>> [1] NA
>>>>
>>>> .rowSums(rev(x), m = 1, n = 2)
>>>
>>> [1] NaN
>>>
>>>> rowSums(matrix(x, nrow = 1, ncol = 2))
>>>
>>> [1] NA
>>>>
>>>> rowSums(matrix(rev(x), nrow = 1, ncol = 2))
>>>
>>> [1] NaN
>>>
>>> I'd expect NA to trump NaN in all cases (with na.rm = FALSE).  sum()
>>> does not have this problem and returns NA in both cases (*).
>>>
>>> For the same R version build from source on RHEL 6.6 system
>>> (completely different architecture), I get the expected result (= NA)
>>> for all of the above cases, e.g.
>>>
>>>> x <- c(NA, NaN)
>>>> mean(x)
>>>
>>> [1] NA
>>>>
>>>> mean(rev(x))
>>>
>>> [1] NA
>>> [...]
>>>
>>> Before going insane trying to troubleshoot this, I have a vague memory
>>> that this, or something related to this, has been discussed
>>> previously, but I cannot locate it.
>>>
>>> Is the above a bug in R, a FAQ, a build error, overzealous compiler
>>> optimization, and / or ...?
>>>
>>> Thanks,
>>>
>>> Henrik
>>
>>
>>
>> --
>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>> Emeritus Professor of Applied Statistics, University of Oxford
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel



More information about the R-devel mailing list