[Rd] For integer vectors, `as(x, "numeric")` has no effect.
Josh O'Brien
joshmobrien at gmail.com
Tue Jan 5 22:55:31 CET 2016
On Tue, Jan 5, 2016 at 1:31 AM, Martin Maechler
<maechler at stat.math.ethz.ch> wrote:
>>>>>> Josh O'Brien <joshmobrien at gmail.com>
>>>>>> on Mon, 4 Jan 2016 16:16:51 -0800 writes:
>
> > On Dec 19, 2015, at 3:32 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>
> >>>>>>> Martin Maechler <maechler at stat.math.ethz.ch> on
> >>>>>>> Sat, 12 Dec 2015 10:32:51 +0100 writes:
> >>
> >>>>>>> John Chambers <jmc at r-project.org> on Fri, 11 Dec
> >>>>>>> 2015 10:11:05 -0800 writes:
> >>
> >>>> Somehow, the most obvious fixes are always
> >>>> back-incompatible these days. The example intrigued
> >>>> me, so I looked into it a bit (should have been doing
> >>>> something else, but ....)
> >>
> >>>> You're right that this is the proverbial
> >>>> thin-edge-of-the-wedge.
> >>
> >>>> The problem is in setDataPart(), which will be called
> >>>> whenever a class extends one of the vector types.
> >>
> >>>> It does as(value, dataClass). The key point is that the
> >>>> third argument to as() is strict=TRUE by default. So,
> >>>> yes, the change will cause all integer vectors to
> >>>> become double when the class extends "numeric".
> >>>> Generally, strict=TRUE makes sense here and of course
> >>>> changing THAT would open up yet more incompatibilities.
> >>
> >>>> For back compatibility, one would have to have some
> >>>> special code in setDataPart() for the case of
> >>>> integer/numeric.
> >>
> >>>> John
> >>
> >>>> (Historically, the original sin was probably not making
> >>>> a distinction between "numeric" as a virtual class and
> >>>> "double" as a type/class.)
> >>
> >>> Yes, indeed. In the meantime, I've seen more cases
> >>> where "the change will cause all integer vectors to
> >>> become double when the class extends 'numeric'" seems
> >>> detrimental.
> >>
> >>> OTOH, I still think we could go in the right direction
> >>> --- hopefully along the wishes of bioconductor S4
> >>> development, see Martin Morgan's e-mail:
> >>
> >>> [This is all S4 - only; should not much affect base R /
> >>> S3] Currently, "integer" is a subclass of "numeric" and
> >>> so the "integer become double" part seems unwanted to
> >>> me. OTOH, it would really make sense to more formally
> >>> have the basic subclasses of "numeric" to be "integer"
> >>> and "double", and to let as(*, "double") to become
> >>> different to as(*, "numeric") [Again, this is just for
> >>> the S4 classes and as() coercions, *not* e.g. for
> >>> as.numeric() / as.double() !]
> >>
> >>> In the DEPRECATED part of the NEWS for R 2.7.0 (April
> >>> 2008) we have had
> >>
> >>> o The S4 pseudo-classes "single" and double have been
> >>> removed. (The S4 class for a REALSXP is "numeric": for
> >>> back-compatibility as(x, "double") coerces to
> >>> "numeric".)
> >>
> >>> I think the removal of "single" was fine, but in
> >>> hindsight, maybe the removal of "double" -- which was
> >>> partly broken then -- possibly could rather have been a
> >>> fixup of "double" along the following
> >>
> >>> Current "thought experiment proposal" :
> >>
> >>> 1) "numeric" := {"integer", "double"} { class -
> >>>    subclasses }
> >>> 2) as(1L, "numeric") continues to return 1L
> >>>    .. since integer is one case of "numeric"
> >>> 3) as(1L, "double") newly returns 1.0 {and in fact would be
> >>>    "equivalent" to as.double(1L)}
> >>
> >>> After the above change, S4 as(*, "double") would
> >>> correspond to S3 as.double but as(*, "numeric") would
> >>> continue to differ from as.numeric(*), the former *not*
> >>> changing integers to double.
> >>
> >>> Martin
> >>
> >> Also note that e.g.
> >>
> >> class(pi) would return "double" instead of "numeric"
> >>
> >> and this will break all the bad programming style usages
> >> of
> >>
> >> if(class(x) == "numeric")
> >>
> >> which I tend to see in gazillions of user and even
> >> package code. This is bad (aka error-prone!) because
> >> "correct" usage would be
> >>
> >> if(inherits(x, "numeric"))
> >>
> >> and that of course would *not* break after the change
> >> above.
> >>
> >> - - - -
> >>
> >> A week later, I'm still pretty convinced it would be
> >> worth going in the direction proposed above.
> >>
> >> But I was actually hoping for some encouragement or
> >> "mental support"... or then to hear why you think the
> >> proposition is not good or not viable ...
> >>
> >>
>
> > I really like Martin Maechler's "thought experiment
> > proposal", but (based partly on the reception it has gotten)
> > figure I mustn't be appreciating the complications it
> > would introduce.
>
> Actually, I've spent half a day implementing it and was very
> pleased about it... as a matter of fact it passed *all* our checks,
> also in all recommended packages (*)
>
> To do it cleanly... with very few code changes,
> the *only* consequence would be that
>
> class(1.)
>
> (and similar) would then return "double" instead of "numeric",
> which *would* be a logical consequence, because indeed,
>
> numeric = {integer, double}
>
> in that new scheme, and class(1L) also returns "integer".
>
> To my big chagrin there was very big opposition to such a change,
> IIRC, mainly on the grounds that for 20 years or so S and then R
> books and publications had said that double and numeric should
> be basically the same.
>
> (*) Below you have a C level proposal which as you note is
> similar to John Chambers R level change:
>
> The consequence is that basically you can no longer have "integer"
> entries in "numeric" slots; they are automagically made into "double".
> I personally find that not really "acceptable" {waste of storage},
> and I would guess that more code "out there in package-land and
> user-code" would break than with my change.
>
> > That said, if it's decided to just make a smaller fix of
> > as(x, "numeric"), might it be better to make the change at
> > the C level, to R_set_class in $RHOME/src/main/coerce.c?
>
> I'm not seeing the advantage to make the change there, apart
> from possibly some efficiency gain.
>
One advantage (relative to a solution based on setting a new S4 coerce()
method for signature c("integer", "numeric")) is that it would also make
the following conversion work as naively expected:

    x <- 10L
    class(x) <- "numeric"
    class(x)
    # [1] "integer"    ## would be "numeric"

I know that's not a recommended strategy for converting an object's
class, but for users like me, trying to make sense of as() and the
class system, it would be even more perplexing if `as(x, "numeric")`
and `class(x) <- "numeric"` yielded different results.
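
For anyone wanting to check this on their own installation, here is a
minimal sketch that puts the three routes side by side. The labels in
`routes` are just names made up for this illustration, and I have
deliberately not written expected results for the first two routes,
since they may depend on the version of R in use:

```r
library(methods)  # as() lives in 'methods'; attach it explicitly for Rscript

x <- 10L

# Three ways one might try to turn an integer vector into a "numeric".
# The list names are illustrative labels only, not R functions.
routes <- list(
  "as()"         = function(v) as(v, "numeric"),
  "class<-"      = function(v) { class(v) <- "numeric"; v },
  "as.numeric()" = function(v) as.numeric(v)
)

# Report the storage type and class each route produces.
for (label in names(routes)) {
  out <- routes[[label]](x)
  cat(sprintf("%-14s typeof = %-8s class = %s\n",
              label, typeof(out), class(out)[1L]))
}
```

Only the last route, as.numeric(), is one I'd count on to give a double
in every version.
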
> For the time being, I will not work on this ... mainly as I still
> believe that my proposal would lead to a much much cleaner setup
> (and yes, even be worth some small changes in new editions of
> those R books which deal with such subtle issues)
>
Thanks, anyway, for having looked into this. If no changes are to be made,
then it might (?) be worth modifying the "Basic Coercion Methods" section
of ?as. It currently reads:

    Methods are pre-defined for coercing any object to one of the
    basic datatypes. For example, 'as(x, "numeric")' uses the
    existing 'as.numeric' function. These built-in methods can be
    listed by 'showMethods("coerce")'.

which is not accurate for integer vectors 'x'.
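
One way to test the help page's wording directly is to compare the two
calls for a few basic vector types. The sketch below is only
illustrative: `same_as_as_numeric` is a made-up helper name, and
whether the integer case reports TRUE or FALSE depends on the version
of R being run, so I have left the output out:

```r
library(methods)  # as() and showMethods() live in 'methods'

# Hypothetical helper (not part of R): does as(x, "numeric") return
# exactly what as.numeric(x) returns?
same_as_as_numeric <- function(x) {
  identical(as(x, "numeric"), as.numeric(x))
}

inputs <- list(
  integer   = 1:3,
  double    = c(1.5, 2),
  logical   = c(TRUE, FALSE),
  character = c("1", "2")
)

# One TRUE/FALSE per input type.
agreement <- vapply(inputs, same_as_as_numeric, logical(1L))
print(agreement)

# The built-in coerce methods the help page refers to:
showMethods("coerce")
```

Any FALSE in `agreement` marks a type for which the sentence in ?as
does not hold.
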
> Martin