[Rd] identical(0, -0)

Martin Maechler maechler at stat.math.ethz.ch
Tue Aug 25 13:38:59 CEST 2009


>>>>> "HenrikB" == Henrik Bengtsson <hb at stat.berkeley.edu>
>>>>>     on Sat, 22 Aug 2009 08:34:51 -0700 writes:

    HenrikB> On Sat, Aug 22, 2009 at 1:22 AM, Petr Savicky<savicky at cs.cas.cz> wrote:
    >> On Sat, Aug 22, 2009 at 12:00:44AM +0200, Martin Maechler wrote:
    >>> I have taken up the issue now,
    >>> and after thinking, studying the source, trying to define a
    >>> 'method = <string>' argument, came to the conclusion that both
    >>> the implementation and documentation (and source code "self-explanation")
    >>> are easiest to program, maintain, and understand,
    >>> if I introduce explicit binary switches,
    >>> so I now  propose the following R-level interface which keeps
    >>> the current behavior the default:
    >>> 
    >>> >> Usage:
    >>> >>
    >>> >>      identical(x, y, num.EQ = TRUE, one.NA = TRUE, attrib.asSet = TRUE)
    >>> >>
    >>> >> Arguments:
    >>> >>
    >>> >>     x, y: any R objects.
    >>> >>
    >>> >>   num.EQ: logical indicating if ('double' and 'complex' non-'NA')
    >>> >>           numbers should be compared using '==', or by bitwise
    >>> >>           comparison.  The latter (non-default) differentiates between
    >>> >>           '-0' and '+0'.
    >>> >>
    >>> >>   one.NA: logical indicating if there is conceptually just one numeric
    >>> >>           'NA' and one 'NaN';  'one.NA = FALSE' differentiates bit
    >>> >>           patterns.
    >>> >>
    >>> >> attrib.asSet: logical indicating if 'attributes' of 'x' and 'y' should
    >>> >>           be treated as _unordered_ tagged pairlists ("sets"); this
    >>> >>           currently also applies to 'slot's of S4 objects.  It may well
    >>> >>           be too strict to set 'attrib.asSet = FALSE'.
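
For readers following along, here is a minimal sketch of the distinctions these switches are about. It uses only the default behavior of identical(), not the proposed arguments; the helper `bits` and the variable names are made up for illustration.

```r
# Signed zeros: equal under '==', but distinguishable by sign and by bit pattern.
pos <- 0
neg <- -0

pos == neg            # TRUE: '==' does not see the sign of zero
identical(pos, neg)   # TRUE with the default behavior
1 / pos               # Inf
1 / neg               # -Inf: the sign of zero shows up here

# Hypothetical helper: the raw bytes of a double, for a bitwise comparison.
bits <- function(x) writeBin(x, raw())
identical(bits(pos), bits(neg))   # FALSE: the sign bit differs

# Attributes: same set of attributes, stored in a different order.
x <- structure(1, a = 1, b = 2)
y <- structure(1, b = 2, a = 1)
identical(x, y)                                        # TRUE by default
identical(names(attributes(x)), names(attributes(y)))  # FALSE: order differs
```

The bitwise comparison via writeBin() is only a stand-in for what a non-default 'num.EQ = FALSE' is meant to detect; it is not how identical() is implemented.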

    HenrikB> My only comment is to make the argument notation a bit more consistent:

    HenrikB> (num.Eq, one.NA, attrib.as.set)

    HenrikB> or

    HenrikB> (numEq, oneNA, attribAsSet)

Thank you.  I think I'd prefer the (older) style with "."
{and I had only one "." in all options},
and yes, I agree that these are a bit more consistent.

    HenrikB> Also, maybe "single" instead of "one".

Yes, that's possibly better...
Other votes (on this part)?


    >> I appreciate having several binary switches. Besides the arguments above,
    >> this is also useful for interactive use of identical(), for example,
    >> for debugging purposes. If there is a difference between objects, then
    >> the switches make it possible to find out what type of difference
    >> it is.

exactly, thanks..

    >>> I'm open to better names of arguments, but will not accept "_"
    >>> in the argument names {just my taste; no reason for arguing...}.
    >> 
    >> I would slightly prefer one.NaN instead of one.NA.

    >> In IEEE 754 terminology, R's 'NA's are a subset of
    >> 'NaN's. So NaN is a slightly more general notion, although in
    >> R, the sets of 'NA's and 'NaN's are disjoint. Moreover,
    >> the name one.NaN makes it clearer that the issue
    >> matters only for numeric types and not, for example,
    >> for integer.
    >> 
    >> Petr.

You are right, of course, about IEEE NaNs,
and also about the fact that there *are* non-numeric NAs in R.

However, in the R world, 'NA' is much better known to users,
and much more importantly,

  > is.na(NaN)
  [1] TRUE
  > is.nan(NA)
  [1] FALSE

so in R,  (the) NaN is rather a special case of NA.
Additionally, 'NA' is considerably faster to type than 'NaN'..
Consequently, I'd rather keep that.
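
A small sketch of the NA / NaN relationship discussed above, again using only existing base R functions; the variable names are chosen here for illustration.

```r
# At the R level, NaN counts as "missing", but NA is not a NaN.
is.na(NaN)          # TRUE
is.nan(NA)          # FALSE (logical NA)
is.nan(NA_real_)    # FALSE, although NA_real_ is an IEEE NaN underneath

# identical() keeps the numeric NA and NaN apart.
identical(NA_real_, NaN)   # FALSE

# Their bit patterns differ as well (compare the raw bytes of each double).
na_bits  <- writeBin(NA_real_, raw())
nan_bits <- writeBin(NaN, raw())
identical(na_bits, nan_bits)   # FALSE

# NA also exists for non-numeric types, where NaN has no counterpart.
is.na(NA_integer_)     # TRUE
is.na(NA_character_)   # TRUE
is.nan(NA_integer_)    # FALSE
```

This only shows the two canonical values; 'one.NA = FALSE' is about the further bit patterns that can also represent an NA or a NaN.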

Thanks again, Petr and Henrik, for your feedback!
Martin


