[R] [FORGED] Re: identical() versus sapply()

John Kane jrkrideau at inbox.com
Tue Apr 12 13:35:32 CEST 2016




> -----Original Message-----
> From: bgunter.4567 at gmail.com
> Sent: Mon, 11 Apr 2016 19:18:39 -0700
> To: murdoch.duncan at gmail.com
> Subject: Re: [R] [FORGED] Re: identical() versus sapply()
> 
> "The documentation aims to be accurate, not necessarily clear."
> 
> !!!
> 
> I hope that is not the case! Accurate documentation that is confusing
> is not very useful. I understand that it is challenging to write docs
> that are both clear and accurate; but I hope that is always the goal.

I have lost the link, but someone here had a lovely essay on R documentation which pointed out that one had to have "faith" that everything was in the documentation.


> 
> Cheers,
> Bert
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> 
> 
> On Mon, Apr 11, 2016 at 6:09 PM, Duncan Murdoch
> <murdoch.duncan at gmail.com> wrote:
>> On 11/04/2016 8:25 PM, Paulson, Ariel wrote:
>>> 
>>> Hi Jeff,
>>> 
>>> 
>>> We are splitting hairs because R is splitting hairs, and causing us
>>> problems.  Integer and numeric are different R classes with different
>>> properties, mathematical relationships notwithstanding.  For instance,
>>> the counterintuitive result:
>> 
>> 
>> The issue here is that R has grown.  The as() function is newer than the
>> as.numeric() function, it's part of the methods package.  It is a much
>> more complicated thing, and there are cases where they differ.
>> 
>> In this case, the problem is that is(1L, "numeric") evaluates to TRUE,
>> and nobody has written a coerce method that specifically converts
>> "integer" to "numeric".  So the as() function defaults to doing nothing.
>> It takes a while to do nothing, approximately 360 times longer than
>> as.numeric() takes to actually do the conversion:
>> 
>>> microbenchmark(as.numeric(1L), as(1L, "numeric"))
>> Unit: nanoseconds
>>               expr   min    lq      mean  median       uq     max neval
>>     as.numeric(1L)   133   210    516.92   273.5    409.5    9444   100
>>  as(1L, "numeric") 51464 64501 119294.31 99768.5 138321.0 1313669   100
>> 
>> R performance is not always simple and easy to predict, but I think
>> anyone who had experience with R would never use as(x, "numeric").  So
>> this just isn't a problem worth fixing.
>> 
>> Now, you might object that the documentation claims they are equivalent,
>> but it certainly doesn't.  The documentation aims to be accurate, not
>> necessarily clear.
>> 
>> Duncan Murdoch
>> 
>> 
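A minimal sketch of the distinction described above, assuming only base R plus the methods package. It checks the storage type with typeof() and coerces explicitly with as.numeric(). The as() result is printed rather than asserted: this thread reports "integer" there, and that may well depend on the R version. Variable names are illustrative only.

```r
# Integer input, as produced by the `:` operator
x <- 1:2
type_before <- typeof(x)                  # "integer"

# Explicit coercion with the base function yields doubles
y <- as.numeric(x)
type_after <- typeof(y)                   # "double"

# identical() compares storage type as well as value
same_as_double <- identical(y, c(1, 2))   # TRUE
same_as_int    <- identical(x, c(1, 2))   # FALSE: integer vs double

# as() from the methods package: this thread reports "integer" here;
# printed only, because the result may depend on the R version
print(class(methods::as(x, "numeric")))
```

If the storage type matters downstream, checking typeof() after the conversion is probably safer than relying on either function's documentation.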
>>> 
>>>> identical(as.integer(1), as.numeric(1))
>>> 
>>> [1] FALSE
>>> 
>>> 
>>> Unfortunately the reply-to chain doesn't extend far enough -- here is
>>> the original problem:
>>> 
>>> 
>>>> sapply(1, identical, 1)
>>> 
>>> [1] TRUE
>>> 
>>>> sapply(1:2, identical, 1)
>>> 
>>> [1] FALSE FALSE
>>> 
>>>> sapply(1:2, function(i) identical(as.numeric(i),1) )
>>> 
>>> [1]  TRUE FALSE
>>> 
>>>> sapply(1:2, function(i) identical(as(i,"numeric"),1) )
>>> 
>>> [1] FALSE FALSE
>>> 
>>> These are the results of R's hair-splitting!
>> 
>> 
>> 
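A sketch of the original problem and a few ways around it, assuming the goal is simply to ask "is this element equal to 1?" regardless of storage type. The variable names are made up for illustration.

```r
ints <- 1:2                                   # integer vector

# Each element is an integer, so identical() against the double 1 fails
vs_double <- sapply(ints, identical, 1)       # FALSE FALSE

# Comparing with the integer literal 1L matches on type and value
vs_integer <- sapply(ints, identical, 1L)     # TRUE FALSE

# `==` compares values, promoting integer to double as needed
by_value <- ints == 1                         # TRUE FALSE

# vapply() behaves like sapply() here but also checks the result type
checked <- vapply(ints, function(i) identical(as.numeric(i), 1), logical(1))
```

Which of these fits best depends on whether a type mismatch should count as a difference; identical() says yes, `==` says no.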
>>> 
>>> Ariel
>>> 
>>> ________________________________
>>> From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>>> Sent: Monday, April 11, 2016 6:49 PM
>>> To: Bert Gunter; Paulson, Ariel
>>> Cc: Rolf Turner; r-help at r-project.org
>>> Subject: Re: [R] [FORGED] Re: identical() versus sapply()
>>> 
>>> Hypothesis regarding the thought process: integer is a perfect subset
>>> of numeric, so why split hairs?
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>> 
>>> On April 11, 2016 12:36:56 PM PDT, Bert Gunter <bgunter.4567 at gmail.com>
>>> wrote:
>>> 
>>> Indeed!
>>> 
>>> Slightly simplified to emphasize your point:
>>> 
>>>   class(as(1:2,"numeric"))
>>> [1] "integer"
>>> 
>>>   class(as.numeric(1:2))
>>> [1] "numeric"
>>> 
>>> whereas in ?as it says:
>>> 
>>> "Methods are pre-defined for coercing any object to one of the basic
>>> datatypes. For example, as(x, "numeric") uses the existing as.numeric
>>> function. "
>>> 
>>> I suspect this is related to my ignorance of S4 classes (i.e. as() )
>>> and how they relate to S3 classes, but I certainly don't get it
>>> either.
>>> 
>>> Cheers,
>>> Bert
>>> 
>>> 
>>> 
>>> Bert Gunter
>>> 
>>> "The trouble with having an open mind is that people keep coming along
>>> and sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>> 
>>> 
>>> On Mon, Apr 11, 2016 at 9:30 AM, Paulson, Ariel <apa at stowers.org>
>>> wrote:
>>>   Ok, I see the difference between 1 and 1:2, I'll just leave it as one
>>>   of those "only in R" things.
>>> 
>>>   But it seems then, that as.numeric() should guarantee a FALSE outcome,
>>>   yet it does not.
>>> 
>>>   To build on what Rolf pointed out, I would really love for someone to
>>>   explain this one:
>>> 
>>>   str(1)
>>>    num 1
>>> 
>>>   str(1:2)
>>>    int [1:2] 1 2
>>> 
>>>   str(as.numeric(1:2))
>>>    num [1:2] 1 2
>>> 
>>>   str(as(1:2,"numeric"))
>>>    int [1:2] 1 2
>>> 
>>>   Which doubly makes no sense.  1) Either the class is "numeric" or it
>>>   isn't; I did not call as.integer() here.  2) method of recasting should
>>>   not affect final class.
>>> 
>>>   Thanks,
>>>   Ariel
>>> 
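On the 1 versus 1:2 question, a small sketch suggesting that the difference comes from how the value is written rather than from its length. As far as I know, a bare numeric literal is a double, the L suffix gives an integer, and the `:` operator returns integers when its starting value is whole and the result fits the integer type. The names are illustrative.

```r
lit_double  <- typeof(1)          # "double": plain numeric literal
lit_integer <- typeof(1L)         # "integer": L suffix
seq_colon   <- typeof(1:2)        # "integer": `:` with a whole-number start
seq_single  <- typeof(1:1)        # "integer": length 1, still integer
vec_c       <- typeof(c(1, 2))    # "double": built from double literals

# Same value, different storage type, so not identical
same_value  <- 1L == 1            # TRUE
same_object <- identical(1L, 1)   # FALSE
```

So a length-1 vector made with 1:1 is treated exactly like 1:2; it is the literal 1 that is the odd one out.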
>>> 
>>>   -----Original Message-----
>>>   From: Rolf Turner [mailto:r.turner at auckland.ac.nz]
>>>   Sent: Saturday, April 09, 2016 5:27 AM
>>>   To: Jeff Newmiller
>>>   Cc: Paulson, Ariel; 'r-help at r-project.org'
>>>   Subject: Re: [FORGED] Re: [R] identical() versus sapply()
>>> 
>>>   On 09/04/16 16:24, Jeff Newmiller wrote:
>>>   I highly recommend making friends with the str function. Try
>>> 
>>>   str( 1 )
>>>   str( 1:2 )
>>> 
>>>   Interesting.  But to me counter-intuitive.  Since R makes no
>>>   distinction between scalars and vectors of length 1 (or more
>>>   accurately I think, since in R there is *no such thing as a scalar*,
>>>   only a vector of length 1) I don't see why "1" should be treated in a
>>>   manner that is categorically different from the way in which "1:2" is
>>>   treated.
>>> 
>>>   Can you, or someone else with deep insight into R and its rationale,
>>> explain the basis for this difference in treatment?
>>> 
>>>   for the clue you need, and then
>>> 
>>>   sapply( 1:2, identical, 1L )
>>> 
>>>   cheers,
>>> 
>>>   Rolf
>>> 
>>>   --
>>>   Technical Editor ANZJS
>>>   Department of Statistics
>>>   University of Auckland
>>>   Phone: +64-9-373-7599 ext. 88276
>>> 
>>> ________________________________
>>> 
>>>   R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>   https://stat.ethz.ch/mailman/listinfo/r-help
>>>   PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>>   and provide commented, minimal, self-contained, reproducible code.
>>> 
>>> 
>> 
> 
