[Rd] as(1:4, "numeric") versus as.numeric(1:4, "numeric")

John Chambers jmc at r-project.org
Fri Apr 2 02:07:00 CEST 2010


The problem that you have exposed is that if one uses the *standard* 
form of selectMethod() on function "coerce", this could corrupt the 
intended set of methods used by as().  Of course, no one was expected to 
do this, but it's not caught or warned (as opposed to a direct call to 
coerce(), which does generate a warning).
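
A minimal sketch of how one might compare the two lookups in a fresh
session. Results depend on the R version and on what has already been
cached in the session, so no output is shown; the variable names are
illustrative only.

```r
library(methods)

sig <- c("integer", "numeric")

# existsMethod() reports only a directly defined method (no inheritance).
direct <- existsMethod("coerce", sig)

# selectMethod() in its standard form applies inheritance; with
# optional = TRUE it returns NULL instead of signalling an error.
inherited <- selectMethod("coerce", sig, optional = TRUE)

# If a method was selected, show the signature it was defined for.
if (is(inherited, "MethodDefinition")) {
    print(inherited@defined)
}

# Compare what as() and as.numeric() return in this session.
as_class <- class(as(1:4, "numeric"))
base_class <- class(as.numeric(1:4))
print(as_class)
print(base_class)
```

Running the as() call before or after the selectMethod() call, each time
from a clean session, is the comparison discussed in this thread.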

If people think this is something of sufficient importance to put high 
on the fix-it list, contributions are welcome as always.

However, it seems a really bad idea to start making the definition of 
method selection by inheritance depend in an arbitrary way on which 
function is operated on.  Documenting what selection does is hard enough 
as it is.

A solution localized to the as() computation is to treat the mechanism 
involved in a call to setAsMethod as something special, and provide 
whatever information is needed via showAsMethods(), or similar.  From a 
tutorial view, it might be good to emphasize that this is NOT the usual 
method dispatch--indeed, the present discussion supports that view.
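
As a rough illustration of why as() is not ordinary dispatch, the sketch
below uses two hypothetical classes ("Base" and "Derived", invented for
this example) to show that the class definition itself carries a coerce
function, separate from any method defined directly on coerce().

```r
library(methods)

# Hypothetical classes, defined only for this illustration.
setClass("Base", representation(x = "numeric"))
setClass("Derived", contains = "Base", representation(y = "character"))

# The extension object stored in the class definition has its own
# coerce function, independent of the coerce() methods table.
ext <- getClassDef("Derived")@contains$Base
print(is.function(ext@coerce))

# Has a coerce method been defined directly for this signature?
print(existsMethod("coerce", c("Derived", "Base")))

# as() still converts the object to its superclass.
d <- new("Derived", x = 1, y = "a")
b <- as(d, "Base")
print(is(b, "Base"))
```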

Method selection in a functional language is a difficult concept, 
particularly for programmers coming from a standard OOP background.  If 
we're going to change it, let's aim to make it simpler, not more 
complicated.  What about getting rid of the kludgy argument 
useInheritance= in a future version, if nobody has a use for it other 
than in as()?  If you look at the code, you'll see that would simplify 
it significantly, and even speed up selection somewhat. There's a change 
I would be happy about!
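
For reference, a small sketch of what the useInheritance= argument
controls. The helper describe() is hypothetical, and which method (if
any) each call finds depends on the R version and the session, so no
output is shown.

```r
library(methods)

sig <- c("integer", "numeric")

# No inheritance at all: only a directly defined method can match.
m_none <- selectMethod("coerce", sig, optional = TRUE,
                       useInheritance = FALSE)

# Inheritance allowed on the first argument (from) only.
m_from <- selectMethod("coerce", sig, optional = TRUE,
                       useInheritance = c(TRUE, FALSE))

# Hypothetical helper: summarise the signature a method was defined for.
describe <- function(m) {
    if (is(m, "MethodDefinition")) {
        paste(as.character(m@defined), collapse = ", ")
    } else {
        "<no method found>"
    }
}

print(describe(m_none))
print(describe(m_from))
```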

John

On 4/1/10 2:59 PM, Hervé Pagès wrote:
> John Chambers wrote:
>> The point I was making was that as() is not just a synonym for 
>> selecting a method from coerce() by the usual inheritance rules.  I 
>> don't believe it should be, and the documentation emphasizes that 
>> inheritance is not used in the ordinary way.
>
> I got this. If you look carefully at the change I'm suggesting for
> selectMethod(), you will notice that I said that f="coerce" would
> then need to become a special case.
> In other words, when f="coerce", the usual inheritance rules are 
> replaced by the rules that are currently implemented in as() and
> described in its man page.
> So to summarize: (1) the code in as() that is currently in charge of
> selecting/creating/caching the most adequate coerce method is moved
> to selectMethod(), (2) the sections in ?as that describe the rules
> of this non-standard inheritance are moved to ?selectMethod.
>
>>
>> If one were to start rewriting code (which I'm not suggesting) my 
>> preference would be to have coerce() not be a generic function, 
>> eliminating the offending selectMethod() calls.
>
> Then how would one know what as() is doing *exactly*, i.e. which
> coerce method was used or will be used in a given situation?
> showMethods()/selectMethod() are great tools because they allow the
> developer to predict things and troubleshoot.
>
> If you don't like putting the non-standard inheritance rules in
> selectMethod() (when f="coerce") then you can always add something
> like selectAsMethod() and put them there, and also add something
> like showAsMethods(). I guess that's more or less what you are
> saying when you propose to have coerce() not be a generic function,
> at least not a usual one.
> But it's important to expose selectAsMethod()/showAsMethods() to
> the user. We absolutely need them!
>
> Now I'm not sure I understand your concern about putting this
> stuff in the existing selectMethod()/showMethods(). Why not just
> ignore the useInheritance= arg when f="coerce"? Again, this would
> be a special case anyway (and documented). The advantage of this
> solution (over selectAsMethod()/showAsMethods()) is to avoid having
> to introduce and expose 2 new names, so the user doesn't have to
> switch between select*/show* tools depending on whether f="coerce"
> or not.
>
> H.
>
>>
>> John
>>
>>
>> On 4/1/10 12:31 AM, Hervé Pagès wrote:
>>> Hi John,
>>>
>>> John Chambers wrote:
>>>> The example is confusing and debatable, but not an obvious bug.  And
>>>> your presentation of it is the cause of much of the confusion
>>>> (unintentionally I'm sure).
>>>>
>>>> First, slipping from the as() function to methods for the coerce()
>>>> function might surprise a less experienced user.  And in fact, that
>>>> is the point here.  If you look at the as() function, it jumps
>>>> through several hoops and in particular selects a method from coerce
>>>> in such a way as NOT to use inheritance on the from= argument.  (I
>>>> think this makes sense in this case).  So I would assert that your
>>>> selectMethod() output below came from a different session than the
>>>> as(1:4, "numeric").
>>>>
>>>> Starting from a clean session with R 2.10.1:
>>>>
>>>> > class(as(1:4,"numeric"))
>>>> [1] "integer"
>>>> > selectMethod("coerce", c("integer","numeric"))
>>>> Method Definition:
>>>>
>>>> function (from, to = "numeric", strict = TRUE)
>>>> if (strict) {
>>>>     class(from) <- "numeric"
>>>>     from
>>>> } else from
>>>> <environment: namespace:methods>
>>>>
>>>> Signatures:
>>>>         from      to
>>>> target  "integer" "numeric"
>>>> defined "integer" "numeric"
>>>>
>>>> Note, no call to as.numeric().  In a session without a previous call
>>>> to as(), your selectMethod() call triggered a standard inherited
>>>> method selection.  And if you had then gone on to as(), the result
>>>> would have been different.
>>>
>>> Yes indeed. From a fresh start:
>>>
>>> > invisible(selectMethod("coerce", c("integer","numeric")))
>>> > class(as(1:4, "numeric"))
>>> [1] "numeric"
>>>
>>> But without the initial call to selectMethod(), as(1:4, "numeric")
>>> returns an integer vector.
>>>
>>> Sorry but it's hard for me to understand the reasons for having
>>> such behaviour, especially when selectMethod() is described as a
>>> function "to *look* for a method corresponding to a given generic
>>> function and signature". Apparently it does more than just looking...
>>>
>>>>
>>>> In a different clean session:
>>>>
>>>>
>>>> > getMethod("coerce", c("integer", "numeric"))
>>>> Error in getMethod("coerce", c("integer", "numeric")) :
>>>>   No method found for function "coerce" and signature integer, numeric
>>>> > selectMethod("coerce", c("integer", "numeric"))
>>>> Method Definition:
>>>>
>>>> function (from, to, strict = TRUE)
>>>> {
>>>>     value <- as.numeric(from)
>>>>     if (strict)
>>>>         attributes(value) <- NULL
>>>>     value
>>>> }
>>>> <environment: namespace:methods>
>>>>
>>>> Signatures:
>>>>         from      to
>>>> target  "integer" "numeric"
>>>> defined "ANY"     "numeric"
>>>> > class(as(1:4,"numeric"))
>>>> [1] "numeric"
>>>>
>>>> No argument about this being confusing.  Perhaps one should prohibit
>>>> standard selectMethod() on coerce() but that seems a bit arcane to
>>>> thwart folks like you!
>>>>
>>>> Suggested improvements for the current implementation are welcome, so
>>>> long as they consider the best definition of as() in the general 
>>>> sense.
>>>
>>> So one problem seems to be that, on a fresh start, *both*
>>>     as(1:4, "numeric")
>>> and
>>>     selectMethod("coerce", c("integer", "numeric"))
>>> will cache a coerce method for the c("integer", "numeric") signature,
>>> but they don't cache the *same* method!
>>>
>>> The automatic method cached by 'as(1:4, "numeric")' seems to be
>>> coming from:
>>>
>>>   getClassDef("integer")@contains$numeric@coerce
>>>
>>> Maybe one way to improve things would be to modify this part of
>>> the class definition for "integer" so it is in sync with
>>>
>>>   selectMethod("coerce", c("integer", "numeric")).
>>>
>>> There are other situations where the coerce methods are not
>>> in sync:
>>>
>>> > getClassDef("factor")@contains$integer@coerce
>>>   function (from, strict = TRUE)
>>>   {
>>>     attributes(from) <- NULL
>>>     from
>>>   }
>>> <environment: namespace:methods>
>>>
>>> > selectMethod("coerce", c("factor", "integer"))
>>>   Method Definition:
>>>
>>>   function (from, to, strict = TRUE)
>>>   {
>>>     value <- as.integer(from)
>>>     if (strict)
>>>         attributes(value) <- NULL
>>>     value
>>>   }
>>> <environment: namespace:methods>
>>>
>>> That isn't a problem here because both methods will produce
>>> the same result but is there any reason why the former
>>> couldn't use the same code as the latter?
>>>
>>> A more radical approach would be to have a call to
>>>
>>>   selectMethod("coerce", c("integer", "numeric"))
>>>
>>> have the same effect on the table of coerce methods as a
>>> call to
>>>
>>>   as(1:4, "numeric")
>>>
>>> i.e. the former will insert the same automatic method as the
>>> latter. That means that all the hard work done by the as()
>>> function in order to find/create/cache an appropriate method
>>> would need to be moved to selectMethod() so in that function
>>> 'f="coerce"' would become a special case.
>>> Then as() would become a 10 line function (or less) that would
>>> basically delegate to selectMethod("coerce", ...) to do the hard
>>> work. This solution seems better to me as it would then guarantee
>>> consistency between what as() does and what
>>> selectMethod("coerce", ...) says.
>>>
>>> Cheers,
>>> H.
>>>
>>>>
>>>> Regards,
>>>>   John
>>>>
>>>> On 3/31/10 3:52 PM, Hervé Pagès wrote:
>>>>> Hi,
>>>>>
>>>>> > class(as(1:4, "numeric"))
>>>>>   [1] "integer"
>>>>>
>>>>> Surprising but an explanation could be that an integer
>>>>> vector being a particular case of numeric vector, this
>>>>> coercion has nothing to do because 1:4 is already numeric.
>>>>> And indeed:
>>>>>
>>>>> > is.numeric(1:4)
>>>>>   [1] TRUE
>>>>> > is.numeric(as(1:4, "numeric"))
>>>>>   [1] TRUE
>>>>>
>>>>> However, 'as(1:4, "numeric")' is inconsistent with
>>>>>
>>>>> > class(as.numeric(1:4))
>>>>>   [1] "numeric"
>>>>>
>>>>> And, even more confusing, if you look at the coerce,ANY,numeric
>>>>> method:
>>>>>
>>>>> > selectMethod("coerce", c("integer", "numeric"))
>>>>>   Method Definition:
>>>>>
>>>>>   function (from, to, strict = TRUE)
>>>>>   {
>>>>>     value <- as.numeric(from)
>>>>>     if (strict)
>>>>>         attributes(value) <- NULL
>>>>>     value
>>>>>   }
>>>>> <environment: namespace:methods>
>>>>>
>>>>>   Signatures:
>>>>>           from      to
>>>>>   target  "integer" "numeric"
>>>>>   defined "ANY"     "numeric"
>>>>>
>>>>> it calls as.numeric()!
>>>>>
>>>>> So how 'as(1:4, "numeric")' can fail to return the same thing as
>>>>> 'as.numeric(1:4)' is a mystery to me. Could it be
>>>>> that I found a bug?
>>>>>
>>>>> Cheers,
>>>>> H.
>>>>>
>>>>>
>>>>> > sessionInfo()
>>>>> R version 2.11.0 Under development (unstable) (2010-03-15 r51282)
>>>>> x86_64-unknown-linux-gnu
>>>>>
>>>>> locale:
>>>>>  [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
>>>>>  [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
>>>>>  [5] LC_MONETARY=C              LC_MESSAGES=en_CA.UTF-8
>>>>>  [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C
>>>>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>> [11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
>>>>>
>>>>> attached base packages:
>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>
>>>>>
>>>
>


