[Rd] identical(0, -0)
Martin Maechler
maechler at stat.math.ethz.ch
Fri Aug 7 17:41:08 CEST 2009
>>>>> "DM" == Duncan Murdoch <murdoch at stats.uwo.ca>
>>>>> on Fri, 07 Aug 2009 11:25:11 -0400 writes:
DM> On 8/7/2009 10:46 AM, Martin Maechler wrote:
>>>>>>> "TH" == Ted Harding <Ted.Harding at manchester.ac.uk>
>>>>>>> on Fri, 07 Aug 2009 14:49:54 +0100 (BST) writes:
>>
TH> On 07-Aug-09 11:07:08, Duncan Murdoch wrote:
>> >> Martin Maechler wrote:
>> >>>>>>>> William Dunlap <wdunlap at tibco.com>
>> >>>>>>>> on Thu, 6 Aug 2009 15:06:08 -0700 writes:
>> >>> >> -----Original Message-----
>> >>> >> From: r-help-bounces at r-project.org
>> >>> >> [mailto:r-help-bounces at r-project.org] On Behalf Of Giovanni Petris
>> >>> >> Sent: Thursday, August 06, 2009 3:00 PM
>> >>> >> To: milton.ruser at gmail.com
>> >>> >> Cc: r-help at r-project.org; Daniel.Gerlanc at geodecapital.com
>> >>> >> Subject: Re: [R] Why is 0 not an integer?
>> >>> >>
>> >>> >>
>> >>> >> I ran an instant experiment...
>> >>> >>
>> >>> >> > typeof(0)
>> >>> >> [1] "double"
>> >>> >> > typeof(-0)
>> >>> >> [1] "double"
>> >>> >> > identical(0, -0)
>> >>> >> [1] TRUE
>> >>> >>
>> >>> >> Best, Giovanni
>> >>>
>> >>> > But 0.0 and -0.0 have different reciprocals
>> >>>
>> >>> >> 1.0/0.0
>> >>> > [1] Inf
>> >>> >> 1.0/-0.0
>> >>> > [1] -Inf
>> >>>
>> >>> > Bill Dunlap TIBCO Software Inc - Spotfire Division wdunlap
>> >>> > tibco.com
>> >>>
>> >>> yes. {finally something interesting in this boring thread !}
---> diverting to R-devel
>> >>>
>> >>> In April, I had a private e-mail exchange with John
>> >>> Chambers [father of S, notably S4, which also brought identical()]
>> >>> and Bill about the topic,
>> >>> in which I started suggesting that R should be changed such
>> >>> that
>> >>> identical(-0. , +0.)
>> >>> would return FALSE.
>> >>> Bill did mention that it does so for (newish versions of) S+
>> >>> and that he'd prefer that, too,
>> >>> and John said
>> >>>
>> >>> >> I agree on having a preference for a bitwise comparison for
>> >>> >> identical()---that's what the name means after all. But since
>> >>> >> someone implemented the numerical case as the C == it's probably
>> >>> >> going to be more hassle than it's worth to change it. But we
>> >>> >> should make the implementation clear in the documentation.
>> >>>
>> >>> so in principle, we all agreed that R's identical() should be
>> >>> changed here, namely by using something like memcmp() instead
>> >>> of simple '==' , however we haven't bothered to actually
>> >>> *implement* this change.
>> >>>
>> >>> I am currently testing a patch which would make
>> >>> identical(0, -0) return FALSE.
>> >>>
>> >> I don't think that would be a good idea. Other expressions besides
>> >> "-0"
>> >> calculate the zero with the negative sign bit, e.g. the following
>> >> sequence:
>> >>
>> >> pos <- 1
>> >> neg <- -1
>> >> zero <- 0
>> >> y <- zero*pos
>> >> z <- zero*neg
>> >> identical(y, z)
>> >>
>> >> I think most R users would expect the last expression there to be
>> >> TRUE based on the previous two lines, given that pos and neg both
>> >> have finite values. In a simple case like this y == z would be a
>> >> better test to use, but if those were components of a larger
>> >> structure, identical() is all we've got, and people would waste a
>> >> lot of time tracking down why structures differing only in the
>> >> sign of zero were not identical, even though every element tested
>> >> equal.
>>
>> identical() *is* not the same as '==' even if you think of a
>> generalized '==',
>> and your example is not convincing to me.
DM> Fair enough, but after your change, how would one do what
DM> identical(list(pos, neg, zero, y), list(pos, neg, zero, z)) does now?
DM> That seems to me to be a more useful comparison than one that declares
DM> those to be unequal because the signs of y and z differ.
Maybe something like
  all(mapply(`==`, list(pos, neg, zero, y), list(pos, neg, zero, z)))
## or even
  isTRUE(all.equal(list(pos, neg, zero, y), list(pos, neg, zero, z),
                   tolerance = 0))
the latter of which is more flexibly adaptable to what the user
really wants to test.
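
As a minimal sketch of the two alternatives above, assuming only base R
(the names a, b, elementwise_equal and tolerant_equal are illustrative and
not from the thread), the following compares the two lists numerically, so
that the sign of zero is ignored:

```r
# Rebuild Duncan's example: y is +0, z is -0.
pos <- 1
neg <- -1
zero <- 0
y <- zero * pos
z <- zero * neg

# Illustrative names, not part of the original thread.
a <- list(pos, neg, zero, y)
b <- list(pos, neg, zero, z)

# Element-by-element '==' treats +0 and -0 as equal.
elementwise_equal <- all(mapply(`==`, a, b))

# all.equal() with zero tolerance asks for exact numerical equality.
tolerant_equal <- isTRUE(all.equal(a, b, tolerance = 0))

# The sign of zero is still observable through the reciprocal.
recip_y <- 1 / y
recip_z <- 1 / z
```

Note that the mapply() variant assumes both lists have the same length and
hold scalar numbers; all.equal() is the more general of the two.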
>> Further note that help(identical) has always said
>>
>> > Description:
>>
>> > The safe and reliable way to test two objects for being _exactly_
>> > equal. It returns 'TRUE' in this case, 'FALSE' in every other case.
>>
>> which really should distinguish -0 and +0
>>
>>
>> >> Duncan Murdoch
>> >>> Martin Maechler, ETH Zurich
>>
TH> My own view of this is that there may in certain circumstances be an
TH> interest in distinguishing between 0 and (-0), yet normally most
TH> users will simply want to compare the numerical values.
>>
TH> Therefore I am in favour of revising identical() so that it can so
TH> distinguish; but also of taking the opportunity to give it a parameter
TH> say
>>
TH> identical(x,y,sign.bit=FALSE)
>>
TH> so that the default behaviour would be to see 0 and (-0) as identical,
TH> but with sign.bit=TRUE it would see the difference.
>>
TH> However, I put this forward in ignorance of
TH> a) Any difficulties that this may present in re-coding identical();
TH> b) Any complications that may arise when applying this new form
TH> to complex objects.
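
To make the proposed interface concrete, here is a rough sketch of how a
sign.bit argument might behave. The wrapper identical_sb() is hypothetical,
is not part of base R, and handles only double vectors; it assumes that
identical() itself treats +0 and -0 as equal, as in the experiment quoted
at the top of the thread, and it uses the reciprocal trick mentioned by
Bill to detect the sign of a zero:

```r
# Hypothetical helper illustrating the proposed 'sign.bit' argument.
# Not part of base R; restricted to double vectors for simplicity.
identical_sb <- function(x, y, sign.bit = FALSE) {
  stopifnot(is.double(x), is.double(y))
  same <- identical(x, y)
  # Without sign.bit, or if the values already differ, we are done.
  if (!sign.bit || !same) return(same)
  # Positions holding a zero (which() drops NA and NaN entries).
  zeros <- which(x == 0)
  # 1/+0 is Inf and 1/-0 is -Inf, so the reciprocals expose the sign.
  all(1 / x[zeros] == 1 / y[zeros])
}

default_result <- identical_sb(0, -0)
strict_result  <- identical_sb(0, -0, sign.bit = TRUE)
```

This sketch does not address point b) above: recursing into lists and other
compound objects would need considerably more work.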
>>
>> Your proposal would actually need to special-case this one case,
>> whereas my patch replaces the use of '==' (in C) for
>> doubles by memcmp(), something which is already
>> used for several other cases there, and hence seems more
>> consistent and in that way natural.
>>
>> The one thing even the new code would not differentiate is the
>> different NaN's (apart from NA), but they are not distinguishable
>> at the R level either, AFAIK. At least AFAIU our language
>> specifications, we only want two things: NA and NaN.
DM> I don't understand what you are proposing now. The different NaN's have
DM> different bit patterns, so wouldn't memcmp() see a difference? And
DM> taking your literalist point of view, the fact that it is hard to detect
DM> the difference at the R level (requiring C code support to do it)
DM> doesn't mean there is no difference; there's just a very subtle, rarely
DM> detectable difference, like the one between +0 and -0.
DM> Duncan Murdoch
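
For illustration only, the byte-level view under discussion can be
approximated from R code with writeBin(). The helper same_bytes() below is
an assumption made for this sketch, not something from the thread, and it
says nothing about how the patch is implemented in C; it merely shows that
the raw bytes of two doubles can be compared:

```r
# Illustrative helper (not from the thread): compare the raw bytes of two
# double vectors, roughly what a memcmp()-style test would look at.
same_bytes <- function(x, y) {
  stopifnot(is.double(x), is.double(y))
  identical(writeBin(x, raw()), writeBin(y, raw()))
}

# +0 against +0, and +0 against -0 (which differ only in the sign bit).
zero_same_sign <- same_bytes(0, 0)
zero_diff_sign <- same_bytes(0, -0)

# NA and NaN: is.na() is TRUE for both, but is.nan() tells them apart.
na_vs_nan_bytes <- same_bytes(NA_real_, NaN)
na_is_nan <- is.nan(NA_real_)
nan_is_na <- is.na(NaN)
```

A comparison of this kind is stricter than '==' in both directions
discussed here: it separates the two zeros, and it would also separate NaN
values whose stored bits differ.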
>>
>> Martin
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel