[Rd] NA_real_ <op> NaN -> NA or NaN, should we care?

Fri May 1 14:14:58 CEST 2009

>>>>> William Dunlap <wdunlap at tibco.com>
>>>>>     on Thu, 30 Apr 2009 10:51:43 -0700 writes:

    > On Linux when I compile R 2.10.0 (devel) (src/main/arithmetic.c in
    > particular) with gcc 3.4.5 using the flags -g -O2 I get noncommutative
    > behavior when

Is this really gcc 3.4.5 (which is quite old)?

Without being an expert, I'd tend to call this a compiler
(optimization) bug ... but most probably the ANSI / ISO C
standard (and libc?) does not define exactly which NaN results
from arithmetic on two different NaNs.

    > adding NA and NaN:
    >> NA_real_ + NaN
    > [1] NaN
    >> NaN + NA_real_
    > [1] NA
    > If I compile src/main/arithmetic.c without optimization (just -g)
    > then both of those return NA.

    > On Windows, using a precompiled R 2.8.1 from CRAN I get
    > NA for both answers.

    > On Linux, after compiling src/main/arithmetic.c with -g -O2 the bit
    > patterns for NA_real_ and as.numeric(NA) are different:
    >> my_numeric_NA <- as.numeric(NA)
    >> writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    > 0000000 07a2 0000 0000 7ff8
    > 0000010
    >> writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    > 0000000 07a2 0000 0000 7ff0
    > 0000010 
    > On Linux, after compiling with -g the bit patterns for NA_real_
    > and as.numeric(NA) are identical.
    >> my_numeric_NA <- as.numeric(NA)
    >> writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    > 0000000 07a2 0000 0000 7ff8
    > 0000010
    >> writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    > 0000000 07a2 0000 0000 7ff8
    > 0000010

    > On Windows, using precompiled R 2.8.1 and cygwin/bin/od, both of those
    > gave the 7ff8 version.
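
For readers who want to reproduce the dumps above without od, the
following C sketch reads the raw 64-bit pattern of a double. On a
little-endian machine, od -x lists the 16-bit words from low to high, so
"07a2 0000 0000 7ff8" corresponds to 0x7ff80000000007a2 and the -O2
variant to 0x7ff00000000007a2. The helper names (double_bits,
double_from_bits, has_quiet_bit, low_word, print_double_hex) are
illustrative and not taken from the R sources, and the sketch assumes
IEEE 754 binary64 doubles.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Raw 64-bit pattern of a double; memcpy avoids aliasing problems. */
uint64_t double_bits(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Build a double from a raw 64-bit pattern. */
double double_from_bits(uint64_t u)
{
    double x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* Bit 51 is the "quiet" bit of an IEEE 754 binary64 NaN. */
int has_quiet_bit(uint64_t bits)
{
    return (int)((bits >> 51) & 1u);
}

/* Low 32 bits of the pattern; 0x7a2 (1954) in both dumps quoted above. */
uint32_t low_word(uint64_t bits)
{
    return (uint32_t)(bits & 0xffffffffu);
}

/* Print the pattern as 16 hex digits, most significant digit first. */
void print_double_hex(double x)
{
    printf("%016llx\n", (unsigned long long)double_bits(x));
}
```

With these helpers, has_quiet_bit() is 1 for 0x7ff80000000007a2 and 0 for
0x7ff00000000007a2, while low_word() is 0x7a2 for both, so the two dumps
differ only in the quiet bit.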

    > Is this confounding of NA and NaN of concern or does R not promise to
    > keep NA and NaN distinct? 

Hmm, I'd say it *is* of some concern that "+" is not commutative
in the narrow sense, even if I don't know what exactly "R promises".

    > I haven't followed all the macros, but it looks like arithmetic.c just
    > does
    >     result[i]=x[i]+y[i]
    > and lets the compiler/floating point unit decide what to do when x[i]
    > and y[i] are different NaN values (NA is a NaN value).  I haven't looked
    > at the C code for the initialization of NA_real_.  Adding explicit tests
    > for NA-ness in the binary operators (as S+ does) adds a fairly
    > significant cost.
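
The trade-off described above can be sketched in C as follows. plain_add
mirrors the quoted result[i]=x[i]+y[i] and leaves the choice of NaN to
the compiler and floating point unit; na_checked_add is a hypothetical
variant with an explicit NA test on each operand. The NA test used here
(a NaN whose low 32 bits are 1954, i.e. 0x7a2) is an assumption based on
the bit patterns shown earlier, not a copy of the R or S+ code.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

#define NA_LOW_WORD 1954u   /* 0x7a2, the low word in the dumps above */

/* Raw 64-bit pattern of a double. */
uint64_t bits_of(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* A quiet NaN carrying the NA payload (pattern 0x7ff80000000007a2). */
double make_na(void)
{
    uint64_t u = 0x7ff80000000007a2ULL;
    double x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* Assumed NA test: x is a NaN and its low 32 bits equal 1954. */
int is_na_real(double x)
{
    return isnan(x) && (uint32_t)(bits_of(x) & 0xffffffffu) == NA_LOW_WORD;
}

/* Unchecked addition: which NaN comes back for NA + NaN is not pinned down. */
double plain_add(double x, double y)
{
    return x + y;
}

/* Hypothetical checked addition: NA wins regardless of operand order. */
double na_checked_add(double x, double y)
{
    if (is_na_real(x) || is_na_real(y))
        return make_na();
    return x + y;
}
```

na_checked_add returns NA for NA + NaN in either operand order, at the
price of two extra tests per element; plain_add only guarantees that the
result is some NaN.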

Yes, I would be quite reluctant to add such tests,
precisely because of that cost.

Maybe we ("R" :-) should explicitly state that operations mixing
NA & NaN give a result which is NA in the sense of fulfilling is.na(.) 
but *not* promise anything further.
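
A minimal sketch of that weaker contract, in C and assuming IEEE 754
doubles: the sum must be missing in the sense of is.na(.), which in R is
TRUE for both NA and NaN, and the payload is deliberately not inspected.
The function names are hypothetical.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* A quiet NaN with low word 0x7a2, the pattern dumped earlier in the thread. */
double na_like(void)
{
    uint64_t bits = 0x7ff80000000007a2ULL;
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Analogue of is.na() on doubles: true for NA and for every other NaN. */
int is_missing(double x)
{
    return isnan(x) != 0;
}

/* Weaker contract: x + y and y + x are both missing;
 * which of the two NaNs comes back is left open. */
int sum_meets_contract(double x, double y)
{
    return is_missing(x + y) && is_missing(y + x);
}
```

A check of this kind holds for NA + NaN in both operand orders whether
or not the compiler preserves the NA payload.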

Martin Maechler, ETH Zurich

    > Bill Dunlap
    > TIBCO Software Inc - Spotfire Division
    > wdunlap tibco.com