[R] [FORGED] Re: identical() versus sapply()

Duncan Murdoch murdoch.duncan at gmail.com
Tue Apr 12 04:45:44 CEST 2016


On 11/04/2016 10:18 PM, Bert Gunter wrote:
> "The documentation aims to be accurate, not necessarily clear."
>
> !!!
>
> I hope that is not the case! Accurate documentation that is confusing
> is not very useful.

I don't think it is ever intentionally confusing, but it is often 
concise to the point of obscurity.  Words are chosen carefully, and 
explanations are not repeated.  It takes an effort to read it.  It will 
be clear to careful readers, but not to all readers.

I was thinking of the statement quoted earlier, 'as(x, "numeric") uses 
the existing as.numeric function'.  That is different than saying 'as(x, 
"numeric") is the same as as.numeric(x)'.

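A minimal sketch of that distinction, using only base R and the methods
package.  The variable names (x_int, x_dbl, and so on) are illustrative and
do not come from the thread.  The last line only prints the type that as()
returns rather than asserting it, since the thread reports "integer" and that
result may depend on the R version.

```r
# Integer and double are distinct storage types, even for equal values.
x_int <- 1L   # integer literal
x_dbl <- 1    # a plain numeric literal is stored as double

types <- c(int = typeof(x_int), dbl = typeof(x_dbl), seq = typeof(1:2))
print(types)

# identical() compares storage type as well as value.
same_raw  <- identical(x_int, x_dbl)              # integer vs double
same_conv <- identical(as.numeric(x_int), x_dbl)  # double vs double

# The sapply() calls from the original question: 1:2 is an integer vector,
# so each element is compared as an integer unless it is converted first.
via_seq  <- sapply(1:2, identical, 1)
via_conv <- sapply(1:2, function(i) identical(as.numeric(i), 1))

# is() reports that an integer already counts as "numeric".
int_is_numeric <- methods::is(x_int, "numeric")

# Printed for inspection only; the result may differ between R versions.
print(typeof(methods::as(x_int, "numeric")))
```

The sketch suggests relying on an explicit as.numeric() whenever the storage
type matters, because identical() will not treat an integer 1 and a double 1
as the same object.
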
Duncan Murdoch

> I understand that it is challenging to write docs
> that are both clear and accurate; but I hope that is always the goal.
>
> Cheers,
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Apr 11, 2016 at 6:09 PM, Duncan Murdoch
> <murdoch.duncan at gmail.com> wrote:
>> On 11/04/2016 8:25 PM, Paulson, Ariel wrote:
>>>
>>> Hi Jeff,
>>>
>>>
>>> We are splitting hairs because R is splitting hairs, and causing us
>>> problems.  Integer and numeric are different R classes with different
>>> properties, mathematical relationships notwithstanding.  For instance, the
>>> counterintuitive result:
>>
>>
>> The issue here is that R has grown.  The as() function is newer than the
>> as.numeric() function; it's part of the methods package.  It is a much more
>> complicated thing, and there are cases where they differ.
>>
>> In this case, the problem is that is(1L, "numeric") evaluates to TRUE, and
>> nobody has written a coerce method that specifically converts "integer" to
>> "numeric".  So the as() function defaults to doing nothing.
>> It takes a while to do nothing: by the median timings below, approximately
>> 360 times longer than as.numeric() takes to actually do the conversion:
>>
>> > microbenchmark(as.numeric(1L), as(1L, "numeric"))
>> Unit: nanoseconds
>>                expr   min    lq      mean  median       uq     max neval
>>      as.numeric(1L)   133   210    516.92   273.5    409.5    9444   100
>>   as(1L, "numeric") 51464 64501 119294.31 99768.5 138321.0 1313669   100
>>
>> R performance is not always simple and easy to predict, but I think anyone
>> who has experience with R would never use as(x, "numeric").  So this just
>> isn't a problem worth fixing.
>>
>> Now, you might object that the documentation claims they are equivalent, but
>> it certainly doesn't.  The documentation aims to be accurate, not
>> necessarily clear.
>>
>> Duncan Murdoch
>>
>>
>>>
>>> > identical(as.integer(1), as.numeric(1))
>>> [1] FALSE
>>>
>>>
>>> Unfortunately the reply-to chain doesn't extend far enough -- here is the
>>> original problem:
>>>
>>>
>>> > sapply(1, identical, 1)
>>> [1] TRUE
>>>
>>> > sapply(1:2, identical, 1)
>>> [1] FALSE FALSE
>>>
>>> > sapply(1:2, function(i) identical(as.numeric(i),1) )
>>> [1]  TRUE FALSE
>>>
>>> > sapply(1:2, function(i) identical(as(i,"numeric"),1) )
>>> [1] FALSE FALSE
>>>
>>> These are the results of R's hair-splitting!
>>
>>
>>
>>>
>>> Ariel
>>>
>>> ________________________________
>>> From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>>> Sent: Monday, April 11, 2016 6:49 PM
>>> To: Bert Gunter; Paulson, Ariel
>>> Cc: Rolf Turner; r-help at r-project.org
>>> Subject: Re: [R] [FORGED] Re: identical() versus sapply()
>>>
>>> Hypothesis regarding the thought process: integer is a perfect subset of
>>> numeric, so why split hairs?
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> On April 11, 2016 12:36:56 PM PDT, Bert Gunter <bgunter.4567 at gmail.com>
>>> wrote:
>>>
>>> Indeed!
>>>
>>> Slightly simplified to emphasize your point:
>>>
>>>    class(as(1:2,"numeric"))
>>> [1] "integer"
>>>
>>>    class(as.numeric(1:2))
>>> [1] "numeric"
>>>
>>> whereas in ?as it says:
>>>
>>> "Methods are pre-defined for coercing any object to one of the basic
>>> datatypes. For example, as(x, "numeric") uses the existing as.numeric
>>> function. "
>>>
>>> I suspect this is related to my ignorance of S4 classes (i.e. as() )
>>> and how they relate to S3 classes, but I certainly don't get it
>>> either.
>>>
>>> Cheers,
>>> Bert
>>>
>>>
>>>
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming along
>>> and sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>
>>>
>>> On Mon, Apr 11, 2016 at 9:30 AM, Paulson, Ariel <apa at stowers.org> wrote:
>>>    Ok, I see the difference between 1 and 1:2, I'll just leave it as one of
>>> those "only in R" things.
>>>
>>>    But it seems then, that as.numeric() should guarantee a FALSE outcome,
>>> yet it does not.
>>>
>>>    To build on what Rolf pointed out, I would really love for someone to
>>> explain this one:
>>>
>>>    str(1)
>>>     num 1
>>>
>>>    str(1:2)
>>>     int [1:2] 1 2
>>>
>>>    str(as.numeric(1:2))
>>>     num [1:2] 1 2
>>>
>>>    str(as(1:2,"numeric"))
>>>     int [1:2] 1 2
>>>
>>>    Which doubly makes no sense.  1) Either the class is "numeric" or it
>>> isn't; I did not call as.integer() here.  2) method of recasting should not
>>> affect final class.
>>>
>>>    Thanks,
>>>    Ariel
>>>
>>>
>>>    -----Original Message-----
>>>    From: Rolf Turner [mailto:r.turner at auckland.ac.nz]
>>>    Sent: Saturday, April 09, 2016 5:27 AM
>>>    To: Jeff Newmiller
>>>    Cc: Paulson, Ariel; 'r-help at r-project.org'
>>>    Subject: Re: [FORGED] Re: [R] identical() versus sapply()
>>>
>>>    On 09/04/16 16:24, Jeff Newmiller wrote:
>>>    > I highly recommend making friends with the str function. Try
>>>    >
>>>    > str( 1 )
>>>    > str( 1:2 )
>>>
>>>    Interesting.  But to me counter-intuitive.  Since R makes no distinction
>>> between scalars and vectors of length 1 (or more accurately I think, since
>>> in R there is *no such thing as a scalar*, only a vector of length 1)
>>> I don't see why "1" should be treated in a manner that is
>>> categorically different from the way in which "1:2" is treated.
>>>
>>>    Can you, or someone else with deep insight into R and its rationale,
>>> explain the basis for this difference in treatment?
>>>
>>>    > for the clue you need, and then
>>>    >
>>>    > sapply( 1:2, identical, 1L )
>>>
>>>    cheers,
>>>
>>>    Rolf
>>>
>>>    --
>>>    Technical Editor ANZJS
>>>    Department of Statistics
>>>    University of Auckland
>>>    Phone: +64-9-373-7599 ext. 88276
>>>
>>> ________________________________
>>>
>>>    R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>    https://stat.ethz.ch/mailman/listinfo/r-help
>>>    PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>>    and provide commented, minimal, self-contained, reproducible code.
>>


