[R] [FORGED] Re: identical() versus sapply()

Paulson, Ariel apa at stowers.org
Tue Apr 12 04:06:12 CEST 2016


Hi Duncan,

That explains it, thanks!

I rarely use as(), but I had thought that in this case replacing identical(x, y) with identical(x, as(y, class(x))) could be an sapply-friendly way to iron out class differences; then I noticed the inexplicable result.  But now I know about all.equal().
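
A minimal sketch of the two comparisons mentioned above, assuming only base R. The helper name same_value is hypothetical and used only for illustration; all.equal() compares numeric values up to a small tolerance, so this is near-equality rather than exact identity.

```r
# Hypothetical helper: compare values while ignoring the
# integer/double distinction (all.equal() uses a numeric tolerance).
same_value <- function(x, y) isTRUE(all.equal(x, y))

x <- 1:2                              # an integer vector

# identical() compares type as well as value, so integer 1L never matches double 1.
strict <- sapply(x, identical, 1)

# all.equal() treats integer and double as comparable numeric values.
relaxed <- sapply(x, same_value, 1)

print(strict)
print(relaxed)
```

Wrapping all.equal() in isTRUE() matters because all.equal() returns a character description of the difference, not FALSE, when the values differ.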

Thanks,
Ariel 

________________________________________
From: Duncan Murdoch <murdoch.duncan at gmail.com>
Sent: Monday, April 11, 2016 8:09 PM
To: Paulson, Ariel; Jeff Newmiller; Bert Gunter
Cc: r-help at r-project.org
Subject: Re: [R] [FORGED] Re: identical() versus sapply()

On 11/04/2016 8:25 PM, Paulson, Ariel wrote:
> Hi Jeff,
>
>
> We are splitting hairs because R is splitting hairs, and causing us problems.  Integer and numeric are different R classes with different properties, mathematical relationships notwithstanding.  For instance, the counterintuitive result:

The issue here is that R has grown.  The as() function is newer than the
as.numeric() function; it is part of the methods package.  It is a much
more complicated thing, and there are cases where the two differ.

In this case, the problem is that is(1L, "numeric") evaluates to TRUE,
and nobody has written a coerce method that specifically converts
"integer" to "numeric".  So the as() function defaults to doing nothing.
It takes a while to do nothing, approximately 360 times longer than
as.numeric() takes to actually do the conversion:

 > microbenchmark(as.numeric(1L), as(1L, "numeric"))
Unit: nanoseconds
               expr   min    lq      mean  median       uq     max neval
     as.numeric(1L)   133   210    516.92   273.5    409.5    9444   100
  as(1L, "numeric") 51464 64501 119294.31 99768.5 138321.0 1313669   100

R performance is not always simple and easy to predict, but I think
anyone with experience of R would never use as(x, "numeric").  So
this just isn't a problem worth fixing.
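
A short sketch of the behaviour described above, assuming a standard R installation with the methods package available. The variable names are illustrative only. The thread reports that as(1L, "numeric") left the value as an integer; the sketch prints that result without assuming it holds on every R version.

```r
library(methods)   # as() and is() come from the methods package

x <- 1L

# "integer" is regarded as a kind of "numeric" by is().
is_num <- is(x, "numeric")

# as.numeric() always produces a double.
via_as_numeric <- as.numeric(x)

# as() goes through S4 coercion; reported as returning an integer in
# this thread. Inspect the result on your own installation.
via_as <- as(x, "numeric")

print(is_num)
print(typeof(via_as_numeric))
print(typeof(via_as))

# When a double is required, the explicit conversion is the safe choice.
print(identical(as.numeric(x), 1))
```

Checking typeof() rather than class() shows the storage type directly, which is what identical() ends up comparing for plain vectors.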

Now, you might object that the documentation claims they are equivalent,
but it certainly doesn't.  The documentation aims to be accurate, not
necessarily clear.

Duncan Murdoch

>
>> identical(as.integer(1), as.numeric(1))
> [1] FALSE
>
>
> Unfortunately the reply-to chain doesn't extend far enough -- here is the original problem:
>
>
>> sapply(1, identical, 1)
> [1] TRUE
>
>> sapply(1:2, identical, 1)
> [1] FALSE FALSE
>
>> sapply(1:2, function(i) identical(as.numeric(i),1) )
> [1]  TRUE FALSE
>
>> sapply(1:2, function(i) identical(as(i,"numeric"),1) )
> [1] FALSE FALSE
>
> These are the results of R's hair-splitting!


>
> Ariel
>
> ________________________________
> From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> Sent: Monday, April 11, 2016 6:49 PM
> To: Bert Gunter; Paulson, Ariel
> Cc: Rolf Turner; r-help at r-project.org
> Subject: Re: [R] [FORGED] Re: identical() versus sapply()
>
> Hypothesis regarding the thought process: integer is a perfect subset of numeric, so why split hairs?
> --
> Sent from my phone. Please excuse my brevity.
>
> On April 11, 2016 12:36:56 PM PDT, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>
> Indeed!
>
> Slightly simplified to emphasize your point:
>
>   class(as(1:2,"numeric"))
> [1] "integer"
>
>   class(as.numeric(1:2))
> [1] "numeric"
>
> whereas in ?as it says:
>
> "Methods are pre-defined for coercing any object to one of the basic
> datatypes. For example, as(x, "numeric") uses the existing as.numeric
> function. "
>
> I suspect this is related to my ignorance of S4 classes (i.e. as() )
> and how they relate to S3 classes, but I certainly don't get it
> either.
>
> Cheers,
> Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things
> into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Apr 11, 2016 at 9:30 AM, Paulson, Ariel <apa at stowers.org> wrote:
>   OK, I see the difference between 1 and 1:2; I'll just leave it as one of those "only in R" things.
>
>   But it seems then, that as.numeric() should guarantee a FALSE outcome, yet it does not.
>
>   To build on what Rolf pointed out, I would really love for someone to explain this one:
>
>   str(1)
>    num 1
>
>   str(1:2)
>    int [1:2] 1 2
>
>   str(as.numeric(1:2))
>    num [1:2] 1 2
>
>   str(as(1:2,"numeric"))
>    int [1:2] 1 2
>
>   Which doubly makes no sense: 1) either the class is "numeric" or it isn't, and I did not call as.integer() here; 2) the method of recasting should not affect the final class.
>
>   Thanks,
>   Ariel
>
>
>   -----Original Message-----
>   From: Rolf Turner [mailto:r.turner at auckland.ac.nz]
>   Sent: Saturday, April 09, 2016 5:27 AM
>   To: Jeff Newmiller
>   Cc: Paulson, Ariel; 'r-help at r-project.org'
>   Subject: Re: [FORGED] Re: [R] identical() versus sapply()
>
>   On 09/04/16 16:24, Jeff Newmiller wrote:
>   I highly recommend making friends with the str function. Try
>
>   str( 1 )
>   str( 1:2 )
>
>   Interesting.  But to me counter-intuitive.  Since R makes no distinction between scalars and vectors of length 1 (or more accurately I think, since in R there is *no such thing as a scalar*, only a vector of length 1) I don't see why "1" should be treated in a manner that is categorically different from the way in which "1:2" is treated.
>
>   Can you, or someone else with deep insight into R and its rationale, explain the basis for this difference in treatment?
>
>   for the clue you need, and then
>
>   sapply( 1:2, identical, 1L )
>
>   cheers,
>
>   Rolf
>
>   --
>   Technical Editor ANZJS
>   Department of Statistics
>   University of Auckland
>   Phone: +64-9-373-7599 ext. 88276
>
> ________________________________
>
>   R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>   https://stat.ethz.ch/mailman/listinfo/r-help
>   PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>   and provide commented, minimal, self-contained, reproducible code.
>
>       [[alternative HTML version deleted]]
>



