[Rd] identical(0, -0)

Tue Aug 25 13:38:59 CEST 2009

>>>>> "HenrikB" == Henrik Bengtsson <hb at stat.berkeley.edu>
>>>>>     on Sat, 22 Aug 2009 08:34:51 -0700 writes:

    HenrikB> On Sat, Aug 22, 2009 at 1:22 AM, Petr Savicky<savicky at cs.cas.cz> wrote:
    >> On Sat, Aug 22, 2009 at 12:00:44AM +0200, Martin Maechler wrote:
    >>> I have taken up the issue now,
    >>> and after thinking, studying the source, trying to define a
    >>> 'method = <string>' argument, came to the conclusion that both
    >>> the implementation and documentation (and source code "self-explanation")
    >>> are easiest to program, maintain, and understand,
    >>> if I introduce explicit binary switches,
    >>> so I now propose the following R-level interface which keeps
    >>> the current behavior the default:
    >>> 
    >>> >> Usage:
    >>> >>
    >>> >>      identical(x, y, num.EQ = TRUE, one.NA = TRUE, attrib.asSet = TRUE)
    >>> >>
    >>> >> Arguments:
    >>> >>
    >>> >>     x, y: any R objects.
    >>> >>
    >>> >>   num.EQ: logical indicating if ('double' and 'complex' non-'NA')
    >>> >>           numbers should be compared using '==', or by bitwise
    >>> >>           comparison.  The latter (non-default) differentiates between
    >>> >>           '-0' and '+0'.
    >>> >>
    >>> >>   one.NA: logical indicating if there is conceptually just one numeric
    >>> >>           'NA' and one 'NaN';  'one.NA = FALSE' differentiates bit
    >>> >>           patterns.
    >>> >>
    >>> >> attrib.asSet: logical indicating if 'attributes' of 'x' and 'y' should
    >>> >>           be treated as _unordered_ tagged pairlists ("sets"); this
    >>> >>           currently also applies to 'slot's of S4 objects.  It may well
    >>> >>           be too strict to set 'attrib.asSet = FALSE'.
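
For illustration, a minimal sketch of what two of these switches change. It assumes an R version in which the switches exist and are spelled num.eq and attrib.as.set (the spelling used in released R), not the num.EQ / attrib.asSet spelling of the proposal quoted above; the objects x and y are made up for the example.

```r
# Assumption: the switches are spelled num.eq and attrib.as.set,
# as in released versions of R, not as in the proposal quoted above.

identical(0, -0)                   # TRUE:  compared with '=='
identical(0, -0, num.eq = FALSE)   # FALSE: the bit patterns differ in the sign bit
1 / 0                              # Inf
1 / -0                             # -Inf, so the two zeros are distinguishable

# Same attributes, attached in a different order
x <- 1:3; attr(x, "a") <- 1; attr(x, "b") <- 2
y <- 1:3; attr(y, "b") <- 2; attr(y, "a") <- 1

identical(x, y)                          # TRUE:  attributes treated as a set
identical(x, y, attrib.as.set = FALSE)   # FALSE: attribute order now matters
```

The defaults reproduce the existing behaviour, so only code that asks for the stricter comparison sees a difference.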

    HenrikB> My only comment is to make the argument notation a bit more consistent:

    HenrikB> (num.Eq, one.NA, attrib.as.set)

    HenrikB> or

    HenrikB> (numEq, oneNA, attribAsSet)

Thank you.  I think I'd prefer the (older) style with "."
{and in my proposal every option had exactly one "."},
and yes, I agree that these are a bit more consistent.

    HenrikB> Also, maybe "single" instead of "one".

Yes, that's possibly better.
Other votes (on this part)?

    >> I appreciate having several binary switches. Besides the arguments above,
    >> they are also useful for interactive use of identical(), for example
    >> for debugging. If there is a difference between objects, the
    >> switches make it possible to find out what kind of
    >> difference it is.

Exactly, thanks.
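
The debugging use described above could be sketched roughly as follows. The helper name identical_report is hypothetical (it is not part of R), and the argument names num.eq, single.NA and attrib.as.set are assumed to be those of released R rather than the ones under discussion here.

```r
# Hypothetical helper, not part of R: re-run identical() with each
# switch tightened in turn, to see which kind of strictness makes
# two objects differ.
# Assumption: argument names as in released R.
identical_report <- function(x, y) {
  c(default       = identical(x, y),
    bitwise.num   = identical(x, y, num.eq = FALSE),
    bitwise.NA    = identical(x, y, single.NA = FALSE),
    ordered.attrs = identical(x, y, attrib.as.set = FALSE))
}

# Only the bitwise numeric comparison tells the two zeros apart.
identical_report(0, -0)
```

For 0 and -0 only the bitwise.num entry is FALSE, which points at the sign of zero as the difference.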

    >>> I'm open to better names for the arguments, but will not accept "_"
    >>> in the argument names {just my taste; no reason for arguing...}.
    >> 
    >> I would slightly prefer one.NaN instead of one.NA.

    >> In IEEE 754 terminology, R's 'NA's are a subset of
    >> 'NaN's. So NaN is a somewhat more general notion, although in
    >> R, the sets of 'NA's and 'NaN's are disjoint. Moreover,
    >> the name one.NaN specifies more clearly that the issue
    >> matters only for numeric types and not, for example,
    >> for integer.
    >> 
    >> Petr.

You are right, of course, about IEEE NaNs,
and also about the fact that there *are* non-numeric NAs in R.

However, in the R world, 'NA' is much better known to users,
and much more importantly,

  > is.na(NaN)
  [1] TRUE
  > is.nan(NA)
  [1] FALSE

so in R, (the) NaN is rather a special case of NA.
Additionally, 'NA' is considerably faster to type than 'NaN'.
Consequently, I'd rather keep that.
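
A few more checks along the same lines, as a sketch using only base R functions: they show that identical() already keeps NA and NaN apart by default, and that NA also exists for non-double types such as integer.

```r
is.na(NaN)                  # TRUE:  NaN counts as "missing"
is.nan(NA_real_)            # FALSE: NA is not a NaN at R level
is.na(NA_integer_)          # TRUE:  NA also exists for integers
is.nan(NA_integer_)         # FALSE

identical(NA_real_, NaN)    # FALSE: NA and NaN are kept apart
identical(NaN, 0/0)         # TRUE:  by default there is just one NaN
```

Whether two NaN values with different bit patterns still compare as identical is what the NA switch is meant to control; this sketch does not try to construct such values.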

Thanks again, Petr and Henrik, for your feedback!
Martin