[Rd] For integer vectors, `as(x, "numeric")` has no effect.
Martin Maechler
maechler at stat.math.ethz.ch
Tue Jan 5 10:31:00 CET 2016
>>>>> Josh O'Brien <joshmobrien at gmail.com>
>>>>> on Mon, 4 Jan 2016 16:16:51 -0800 writes:
> On Dec 19, 2015, at 3:32 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>>>>>>> Martin Maechler <maechler at stat.math.ethz.ch> on
>>>>>>> Sat, 12 Dec 2015 10:32:51 +0100 writes:
>>
>>>>>>> John Chambers <jmc at r-project.org> on Fri, 11 Dec
>>>>>>> 2015 10:11:05 -0800 writes:
>>
>>>> Somehow, the most obvious fixes are always
>>>> back-incompatible these days. The example intrigued
>>>> me, so I looked into it a bit (should have been doing
>>>> something else, but ....)
>>
>>>> You're right that this is the proverbial
>>>> thin-edge-of-the-wedge.
>>
>>>> The problem is in setDataPart(), which will be called
>>>> whenever a class extends one of the vector types.
>>
>>>> It does as(value, dataClass). The key point is that the
>>>> third argument to as() is strict=TRUE by default. So,
>>>> yes, the change will cause all integer vectors to
>>>> become double when the class extends "numeric".
>>>> Generally, strict=TRUE makes sense here and of course
>>>> changing THAT would open up yet more incompatibilities.
>>
>>>> For back compatibility, one would have to have some
>>>> special code in setDataPart() for the case of
>>>> integer/numeric.
>>
>>>> John
>>
>>>> (Historically, the original sin was probably not making
>>>> a distinction between "numeric" as a virtual class and
>>>> "double" as a type/class.)
>>
>>> Yes, indeed. In the meantime, I've seen more cases
>>> where "the change will cause all integer vectors to
>>> become double when the class extends "numeric"" seems
>>> detrimental.
>>
>>> OTOH, I still think we could go in the right direction
>>> --- hopefully along the wishes of bioconductor S4
>>> development, see Martin Morgan's e-mail:
>>
>>> [This is all S4 - only; should not much affect base R /
>>> S3] Currently, "integer" is a subclass of "numeric" and
>>> so the "integer become double" part seems unwanted to
>>> me. OTOH, it would really make sense to more formally
>>> have the basic subclasses of "numeric" to be "integer"
>>> and "double", and to let as(*, "double") become
>>> different from as(*, "numeric") [Again, this is just for
>>> the S4 classes and as() coercions, *not* e.g. for
>>> as.numeric() / as.double() !]
>>
>>> In the DEPRECATED part of the NEWS for R 2.7.0 (April
>>> 2008) we have had
>>
>>> o The S4 pseudo-classes "single" and "double" have been
>>> removed. (The S4 class for a REALSXP is "numeric": for
>>> back-compatibility as(x, "double") coerces to
>>> "numeric".)
>>
>>> I think the removal of "single" was fine, but in
>>> hindsight, the removal of "double" -- which was
>>> partly broken then -- could perhaps rather have been a
>>> fixup of "double" along the following lines.
>>
>>> Current "thought experiment proposal" :
>>
>>> 1) "numeric" := {"integer", "double"} { class - subclasses }
>>> 2) as(1L, "numeric") continues to return 1L
>>>    .. since integer is one case of "numeric"
>>> 3) as(1L, "double") newly returns 1.0
>>>    {and in fact would be "equivalent" to as.double(1L)}
>>
>>> After the above change, S4 as(*, "double") would
>>> correspond to S3 as.double but as(*, "numeric") would
>>> continue to differ from as.numeric(*), the former *not*
>>> changing integers to double.
>>
>>> Martin
>>
>> Also note that e.g.
>>
>> class(pi) would return "double" instead of "numeric"
>>
>> and this will break all the bad programming style usages
>> of
>>
>> if(class(x) == "numeric")
>>
>> which I tend to see in gazillions of user and even
>> package code. This is bad (aka error-prone!) because the
>> "correct" usage would be
>>
>> if(inherits(x, "numeric"))
>>
>> and that of course would *not* break after the change
>> above.
>>
>> - - - -
>>
>> A week later, I'm still pretty convinced it would be
>> worth going in the direction proposed above.
>>
>> But I was actually hoping for some encouragement or
>> "mental support"... or then to hear why you think the
>> proposition is not good or not viable ...
>>
>>
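For readers who want to see the `class(x) == "numeric"` pitfall quoted above in isolation, here is a minimal sketch in plain R. The class name "myclass" and the object `x` are hypothetical, made up for the illustration; nothing here depends on the proposed change.

```r
# Hypothetical S3 object whose class attribute has two entries.
x <- structure(1.5, class = c("myclass", "numeric"))

# Comparing against class(x) yields one logical per class entry,
# so the result is not a single TRUE/FALSE suitable for if().
eq <- class(x) == "numeric"
print(eq)

# inherits() always returns a single logical.
inh <- inherits(x, "numeric")
print(inh)

# The fragile test also misses plain integers entirely, while
# is.numeric() accepts both storage types.
int_eq <- class(1L) == "numeric"
print(int_eq)
print(is.numeric(1L))
```

The point of the sketch is only that the equality test depends on the exact contents of the class attribute, whereas inherits() asks the question the programmer usually means.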
> I really like Martin Maechler's "thought experiment
> proposal", but (based partly on the reception it's gotten)
> figure I mustn't be appreciating the complications it
> would introduce.
Actually, I've spent half a day implementing it and was very
pleased about it... as a matter of fact, it passed *all* our checks,
including those in all recommended packages (*)
To do it cleanly... with very few code changes,
the *only* consequence would be that
    class(1.)
(and similar) would then return "double" instead of "numeric",
which *would* be the logical consequence, because indeed,
    numeric = {integer, double}
in that new scheme, and class(1L) also returns "integer".
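As a point of reference, the current (unchanged) behaviour that the paragraph above contrasts with can be checked as follows. This is only a sketch of what base R and the methods package report without the proposed change, not of the new scheme; the variable names are illustrative.

```r
library(methods)

d <- 1.   # stored as a double
i <- 1L   # stored as an integer

# class() reports "numeric" for doubles but "integer" for integers,
# whereas typeof() names the storage type directly.
cls <- c(class(d), class(i))
typ <- c(typeof(d), typeof(i))
print(cls)
print(typ)

# In S4 terms, "integer" is a subclass of "numeric".
print(extends("integer", "numeric"))
print(is(i, "numeric"))
```

The asymmetry between the two class() results is what the proposal would remove, at the price of changing the first of them.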
To my big chagrin there was very strong opposition to such a change,
IIRC mainly on the grounds that for 20 years or so S and then R
books and publications had said that double and numeric should
be basically the same.
(*) Below you have a C-level proposal which, as you note, is
similar to John Chambers' R-level change:
The consequence is that basically you can no longer have "integer"
entries in "numeric" slots; they are automagically made into "double".
I personally find that not really "acceptable" {waste of storage},
and I would guess that more code "out there in package-land and
user-code" would break than with my change.
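The storage concern above can be sketched with a small S4 class. The class name "Holder" and the slot name "vals" are invented for this illustration. The example only shows that a "numeric" slot accepts an integer vector and lets one inspect how it ends up being stored; whether the integer type is kept depends on the R version and on which of the discussed changes, if any, is in effect, so no expected output is given.

```r
library(methods)

# Hypothetical class with a single "numeric" slot.
setClass("Holder", representation(vals = "numeric"))

# Supply an integer vector for the "numeric" slot.
h <- new("Holder", vals = 1:10)

# Signals an error if the object were invalid; integer vectors
# satisfy a "numeric" slot because "integer" extends "numeric".
validObject(h)

# Inspect the storage type and size actually used for the slot.
print(typeof(h@vals))
print(object.size(h@vals))

# For comparison: the same values stored explicitly as double.
print(object.size(as.double(1:10)))
```

Comparing the two object.size() results gives a rough idea of the "waste of storage" at stake if integers in such slots were always converted to double.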
> That said, if it's decided to just make a smaller fix of
> as(x, "numeric"), might it be better to make the change at
> the C level, to R_set_class in $RHOME/src/main/coerce.c?
I'm not seeing the advantage of making the change there, apart
from possibly some efficiency gain.
For the time being, I will not work on this ... mainly as I still
believe that my proposal would lead to a much much cleaner setup
(and yes, even be worth some small changes in new editions of
those R books which deal with such subtle issues)
Martin