[Rd] NA_real_ <op> NaN -> NA or NaN, should we care?
Martin Maechler
maechler at stat.math.ethz.ch
Fri May 1 15:14:26 CEST 2009
>>>>> "MM" == Martin Maechler <maechler at stat.math.ethz.ch>
>>>>> on Fri, 1 May 2009 14:14:58 +0200 writes:
>>>>> William Dunlap <wdunlap at tibco.com>
>>>>> on Thu, 30 Apr 2009 10:51:43 -0700 writes:
>> On Linux when I compile R 2.10.0(devel) (src/main/arithmetic.c in particular)
>> with gcc 3.4.5 using the flags -g -O2 I get noncommutative behavior when
MM> is this really gcc 3.4.5 (which is quite old) ?
MM> Without being an expert, I'd tend to claim this to be a
MM> compiler (optimization) bug .... but most probably the ANSI /
MM> ISO C (and libc ?) standards would not define the exact
MM> behavior of arithmetic with NaNs.
>> adding NA and NaN:
>>> NA_real_ + NaN
>> [1] NaN
>>> NaN + NA_real_
>> [1] NA
>> If I compile src/main/arithmetic.c without optimization (just -g)
>> then both of those return NA.
>> On Windows, using a precompiled R 2.8.1 from CRAN I get
>> NA for both answers.
>> On Linux, after compiling src/main/arithmetic.c with -g -O2 the bit
>> patterns for NA_real_ and as.numeric(NA) are different:
>>> my_numeric_NA <- as.numeric(NA)
>>> writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
>> 0000000 07a2 0000 0000 7ff8
>> 0000010
>>> writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
>> 0000000 07a2 0000 0000 7ff0
>> 0000010
>> On Linux, after compiling with -g the bit patterns for NA_real_
>> and as.numeric(NA) are identical.
>>> my_numeric_NA <- as.numeric(NA)
>>> writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
>> 0000000 07a2 0000 0000 7ff8
>> 0000010
>>> writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
>> 0000000 07a2 0000 0000 7ff8
>> 0000010
>> On Windows, using precompiled R 2.8.1 and cygwin/bin/od, both of those
>> gave the 7ff8 version.
I've run
echo 'c(NA+NaN, NaN+NA)' | R --slave --vanilla
on 3 different platforms, all Linux, with many installed versions
of R. All of them were compiled with the defaults, i.e. '-O2',
and, depending on release date, with different versions of gcc
on a given platform {I can still check these because I run R
from the installation directory, with all configure results ..}.
I have observed the following:
The result is always "NA NA" (as desired!) iff the platform is 32-bit.
On the 64-bit platform it is "NaN NA" for older versions of R
(R <= 2.3.1) and "NA NaN" for newer versions of R (>= 2.4.1)
{{but note: the versions of R are confounded with the versions of gcc ...}}
Further, as the 64-bit platform can also use the 32-bit versions of R,
I've tested a few and found that indeed, the 32-bit version of R
would return "NA NA" also on the 64-bit platform.
Martin
>> Is this confounding of NA and NaN of concern or does R not promise to
>> keep NA and NaN distinct?
MM> Hmm, I'd say it *is* of some concern that "+" is not commutative
MM> in the narrow sense, even if I don't know what exactly "R promises".
>> I haven't followed all the macros, but it looks like arithmetic.c just does
>> result[i]=x[i]+y[i]
>> and lets the compiler/floating point unit decide what to do when x[i] and y[i]
>> are different NaN values (NA is a NaN value). I haven't looked at the C code
>> for the initialization of NA_real_. Adding explicit tests for NA-ness in the
>> binary operators (as S+ does) adds a fairly significant cost.
MM> Yes, I would be quite reluctant to add such
MM> tests, because such costs are to be expected.
MM> Maybe we ("R" :-) should explicitly state that operations mixing
MM> NA & NaN give a result which is NA in the sense of fulfilling is.na(.)
MM> but *not* promise anything further.
MM> Martin Maechler, ETH Zurich
>> Bill Dunlap
>> TIBCO Software Inc - Spotfire Division
>> wdunlap tibco.com