[Rd] identical(0, -0)

Duncan Murdoch murdoch at stats.uwo.ca
Fri Aug 7 17:25:11 CEST 2009


On 8/7/2009 10:46 AM, Martin Maechler wrote:
>>>>>> "TH" == Ted Harding <Ted.Harding at manchester.ac.uk>
>>>>>>     on Fri, 07 Aug 2009 14:49:54 +0100 (BST) writes:
> 
>     TH> On 07-Aug-09 11:07:08, Duncan Murdoch wrote:
>     >> Martin Maechler wrote:
>     >>>>>>>> William Dunlap <wdunlap at tibco.com>
>     >>>>>>>> on Thu, 6 Aug 2009 15:06:08 -0700 writes:
>     >>> >> -----Original Message-----
>     >>> >> From: r-help-bounces at r-project.org
>     >>> >> [mailto:r-help-bounces at r-project.org] On Behalf Of Giovanni Petris
>     >>> >> Sent: Thursday, August 06, 2009 3:00 PM
>     >>> >> To: milton.ruser at gmail.com
>     >>> >> Cc: r-help at r-project.org; Daniel.Gerlanc at geodecapital.com
>     >>> >> Subject: Re: [R] Why is 0 not an integer?
>     >>> >> 
>     >>> >> 
>     >>> >> I ran an instant experiment...
>     >>> >> 
>     >>> >> > typeof(0)
>     >>> >> [1] "double"
>     >>> >> > typeof(-0)
>     >>> >> [1] "double"
>     >>> >> > identical(0, -0)
>     >>> >> [1] TRUE
>     >>> >> 
>     >>> >> Best, Giovanni
>     >>> 
>     >>> > But 0.0 and -0.0 have different reciprocals
>     >>> 
>     >>> >> 1.0/0.0
>     >>> >    [1] Inf
>     >>> >> 1.0/-0.0
>     >>> >    [1] -Inf
>     >>> 
>     >>> > Bill Dunlap TIBCO Software Inc - Spotfire Division wdunlap
>     >>> > tibco.com
>     >>> 
>     >>> yes.  {finally something interesting in this boring thread !}
>     ---> diverting to R-devel
>     >>> 
>     >>> In April, I had a private e-mail exchange with John
>     >>> Chambers [father of S, notably S4, which also brought identical()]
>     >>> and Bill about the topic,
>     >>> where I had started suggesting that  R  should be changed such
>     >>> that
>     >>> identical(-0. , +0.)
>     >>> would return FALSE.
>     >>> Bill did mention that it does so for (newish versions of) S+
>     >>> and that he'd prefer that, too,
>     >>> and John said
>     >>> 
>     >>> >> I agree on having a preference for a bitwise comparison for
>     >>> >> identical()---that's what the name means after all.  But since
>     >>> >> someone implemented the numerical case as the C == it's probably
>     >>> >> going to be more hassle than it's worth to change it.  But we
>     >>> >> should make the implementation clear in the documentation.
>     >>> 
>     >>> so in principle, we all agreed that R's identical() should be
>     >>> changed here, namely by using something like  memcmp() instead
>     >>> of simple '==' ,  however we haven't bothered to actually 
>     >>> *implement* this change.
>     >>> 
>     >>> I am currently testing a patch  which would lead to
>     >>> identical(0, -0)  returning FALSE.
>     >>> 
>     >> I don't think that would be a good idea.  Other expressions besides
>     >> "-0" calculate the zero with the negative sign bit, e.g. the following
>     >> sequence:
>     >> 
>     >> pos <- 1
>     >> neg <- -1
>     >> zero <- 0
>     >> y <- zero*pos
>     >> z <- zero*neg
>     >> identical(y, z)
>     >> 
>     >> I think most R users would expect the last expression there to be
>     >> TRUE based on the previous two lines, given that pos and neg both
>     >> have finite values. In a simple case like this y == z would be a
>     >> better test to use, but if those were components of a larger
>     >> structure, identical() is all we've got, and people would waste a
>     >> lot of time tracking down why structures differing only in the
>     >> sign of zero were not identical, even though every element tested
>     >> equal.
> 
> identical()  *is* not the same as '=='  even if you think of a
> generalized '==',
> and your example is not convincing to me.

Fair enough, but after your change, how would one do what 
identical(list(pos, neg, zero, y), list(pos, neg, zero, z)) does now? 
That seems to me to be a more useful comparison than one that declares 
those to be unequal because the signs of y and z differ.
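
To make the question concrete, here is a minimal sketch in C, not R's
actual identical() source, of the two candidate comparisons applied to
the numeric values above.  The helper names equal_by_value and
equal_by_bits are invented for illustration, and the sketch assumes
IEEE 754 doubles.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: element-wise comparison with C's ==, which
 * treats +0.0 and -0.0 as equal.  A NaN element makes this return
 * false, since NaN != NaN in C; that case is left out of this sketch. */
bool equal_by_value(const double *a, const double *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i])
            return false;
    }
    return true;
}

/* Hypothetical helper: byte-wise comparison with memcmp(), which
 * sees the sign bit of zero and so distinguishes +0.0 from -0.0. */
bool equal_by_bits(const double *a, const double *b, size_t n)
{
    return memcmp(a, b, n * sizeof(double)) == 0;
}
```

With a = {pos, neg, zero, zero*pos} and b = {pos, neg, zero, zero*neg},
equal_by_value(a, b, 4) holds while equal_by_bits(a, b, 4) does not,
because zero*neg carries the negative sign bit.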

> 
> Further note that help(identical)  has always said
> 
>  > Description:
> 
>  >    The safe and reliable way to test two objects for being _exactly_
>  >    equal.  It returns 'TRUE' in this case, 'FALSE' in every other case.
> 
> which really should distinguish  -0 and +0
> 
> 
>     >> Duncan Murdoch
>     >>> Martin Maechler, ETH Zurich
> 
>     TH> My own view of this is that there may in certain circumstances be an
>     TH> interest in distinguishing between 0 and (-0), yet normally most
>     TH> users will simply want to compare the numerical values.
> 
>     TH> Therefore I am in favour of revising identical() so that it can so
>     TH> distinguish; but also of taking the opportunity to give it a parameter
>     TH> say
> 
>     TH> identical(x,y,sign.bit=FALSE)
> 
>     TH> so that the default behaviour would be to see 0 and (-0) as identical,
>     TH> but with sign.bit=TRUE it would see the difference.
> 
>     TH> However, I put this forward in ignorance of
>     TH> a) Any difficulties that this may present in re-coding identical();
>     TH> b) Any complications that may arise when applying this new form
>     TH> to complex objects.
> 
> Your proposal would actually need to special-case this one case,
> whereas my patch replaces the use of '==' (in C) for doubles
> with memcmp(), something which is already used for several other
> cases there, and hence seems more consistent and in that way
> natural.
> 
> The one thing even the new code would not differentiate is the
> different NaNs (apart from NA), but they are not distinguishable
> at the R level either, AFAIK.  At least AFAIU our language
> specifications, we only want two things: NA and NaN.

I don't understand what you are proposing now.  The different NaNs have 
different bit patterns, so wouldn't memcmp() see a difference?  And 
taking your literalist point of view, the fact that it is hard to detect 
the difference at the R level (requiring C code support to do it) 
doesn't mean there is no difference; there's just a very subtle, rarely 
detectable difference, like the one between +0 and -0.
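
For illustration, a minimal C sketch (again not R's source, with
hypothetical helper names from_bits, same_bits and is_nan) of how two
NaNs that differ only in their payload bits compare.  It assumes
IEEE 754 binary64 doubles that are the same size as uint64_t.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a double from a raw 64-bit pattern.
 * Assumes sizeof(double) == sizeof(uint64_t) and IEEE 754 binary64. */
double from_bits(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Hypothetical helper: byte-wise comparison of two doubles. */
bool same_bits(double a, double b)
{
    return memcmp(&a, &b, sizeof(double)) == 0;
}

/* A NaN is the only double value that compares unequal to itself. */
bool is_nan(double x)
{
    return x != x;
}
```

from_bits(0x7FF8000000000000) and from_bits(0x7FF8000000000001) are
both NaN according to is_nan(), yet same_bits() reports them as
different, just as it does for 0.0 and -0.0.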

Duncan Murdoch

> 
> Martin
> 


