[R] [FORGED] Re: identical() versus sapply()
Paulson, Ariel
apa at stowers.org
Tue Apr 12 04:06:12 CEST 2016
Hi Duncan,
That explains it, thanks!
I rarely use as(), but had thought in this case, replacing identical(x, y) with identical(x, as(y,class(x))) could be an sapply-friendly way to iron out class differences -- then noticed the inexplicable result. But now I know about all.equal().
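For the archive, here is a minimal sketch of the all.equal() approach mentioned above, using only base R. The variable names res_identical and res_all_equal are made up for illustration; the isTRUE() wrapper is needed because all.equal() returns a character description, not FALSE, when the values differ.

```r
# identical() compares storage type as well as value,
# so integer 1L and double 1 are not identical.
print(identical(1L, 1))            # FALSE

# all.equal() compares numeric values within a tolerance and does not
# distinguish integer from double storage.
print(isTRUE(all.equal(1L, 1)))    # TRUE

# The original sapply() comparison, and an all.equal()-based variant.
res_identical <- sapply(1:2, identical, 1)
res_all_equal <- sapply(1:2, function(i) isTRUE(all.equal(i, 1)))

print(res_identical)               # FALSE FALSE
print(res_all_equal)               # TRUE FALSE
```

Whether tolerance-based comparison is appropriate depends on the use case; for exact matching, converting both sides with as.numeric() first is the other option raised in the thread.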
Thanks,
Ariel
________________________________________
From: Duncan Murdoch <murdoch.duncan at gmail.com>
Sent: Monday, April 11, 2016 8:09 PM
To: Paulson, Ariel; Jeff Newmiller; Bert Gunter
Cc: r-help at r-project.org
Subject: Re: [R] [FORGED] Re: identical() versus sapply()
On 11/04/2016 8:25 PM, Paulson, Ariel wrote:
> Hi Jeff,
>
>
> We are splitting hairs because R is splitting hairs, and causing us problems. Integer and numeric are different R classes with different properties, mathematical relationships notwithstanding. For instance, the counterintuitive result:
The issue here is that R has grown. The as() function is newer than the
as.numeric() function, it's part of the methods package. It is a much
more complicated thing, and there are cases where they differ.
In this case, the problem is that is(1L, "numeric") evaluates to TRUE,
and nobody has written a coerce method that specifically converts
"integer" to "numeric". So the as() function defaults to doing nothing.
It takes a while to do nothing, approximately 360 times longer than
as.numeric() takes to actually do the conversion:
> microbenchmark(as.numeric(1L), as(1L, "numeric"))
Unit: nanoseconds
              expr   min    lq      mean  median       uq     max neval
    as.numeric(1L)   133   210    516.92   273.5    409.5    9444   100
 as(1L, "numeric") 51464 64501 119294.31 99768.5 138321.0 1313669   100
R performance is not always simple and easy to predict, but I think
anyone who had experience with R would never use as(x, "numeric"). So
this just isn't a problem worth fixing.
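As a rough sketch of the point above (not a definitive account of the methods package internals), the following checks what is() reports for an integer vector and shows conversions that do change the storage type. The helper to_double() is hypothetical, introduced only for illustration. The result of as(x, "numeric") is deliberately not shown, since it may depend on the R version.

```r
library(methods)  # provides is() and as(); usually attached already

x <- 1:2

# is() says whether an integer vector already counts as "numeric".
# This thread reports TRUE, which is why as(x, "numeric") found
# nothing to convert.
print(is(x, "numeric"))

# as.numeric() returns a vector with storage type double.
y <- as.numeric(x)
print(typeof(x))   # "integer"
print(typeof(y))   # "double"

# Hypothetical helper (not part of R): change only the storage mode,
# keeping any attributes such as names or dim.
to_double <- function(v) {
  storage.mode(v) <- "double"
  v
}

print(identical(to_double(x), c(1, 2)))  # TRUE
print(identical(y, c(1, 2)))             # TRUE
```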
Now, you might object that the documentation claims they are equivalent,
but it certainly doesn't. The documentation aims to be accurate, not
necessarily clear.
Duncan Murdoch
>
>> identical(as.integer(1), as.numeric(1))
> [1] FALSE
>
>
> Unfortunately the reply-to chain doesn't extend far enough -- here is the original problem:
>
>
>> sapply(1, identical, 1)
> [1] TRUE
>
>> sapply(1:2, identical, 1)
> [1] FALSE FALSE
>
>> sapply(1:2, function(i) identical(as.numeric(i),1) )
> [1] TRUE FALSE
>
>> sapply(1:2, function(i) identical(as(i,"numeric"),1) )
> [1] FALSE FALSE
>
> These are the results of R's hair-splitting!
>
> Ariel
>
> ________________________________
> From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> Sent: Monday, April 11, 2016 6:49 PM
> To: Bert Gunter; Paulson, Ariel
> Cc: Rolf Turner; r-help at r-project.org
> Subject: Re: [R] [FORGED] Re: identical() versus sapply()
>
> Hypothesis regarding the thought process: integer is a perfect subset of numeric, so why split hairs?
> --
> Sent from my phone. Please excuse my brevity.
>
> On April 11, 2016 12:36:56 PM PDT, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>
> Indeed!
>
> Slightly simplified to emphasize your point:
>
> class(as(1:2,"numeric"))
> [1] "integer"
>
> class(as.numeric(1:2))
> [1] "numeric"
>
> whereas in ?as it says:
>
> "Methods are pre-defined for coercing any object to one of the basic
> datatypes. For example, as(x, "numeric") uses the existing as.numeric
> function. "
>
> I suspect this is related to my ignorance of S4 classes (i.e. as() )
> and how they relate to S3 classes, but I certainly don't get it
> either.
>
> Cheers,
> Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Apr 11, 2016 at 9:30 AM, Paulson, Ariel <apa at stowers.org> wrote:
> Ok, I see the difference between 1 and 1:2, I'll just leave it as one of those "only in R" things.
>
> But it seems then, that as.numeric() should guarantee a FALSE outcome, yet it does not.
>
> To build on what Rolf pointed out, I would really love for someone to explain this one:
>
> str(1)
> num 1
>
> str(1:2)
> int [1:2] 1 2
>
> str(as.numeric(1:2))
> num [1:2] 1 2
>
> str(as(1:2,"numeric"))
> int [1:2] 1 2
>
> Which doubly makes no sense. 1) Either the class is "numeric" or it isn't; I did not call as.integer() here. 2) method of recasting should not affect final class.
>
> Thanks,
> Ariel
>
>
> -----Original Message-----
> From: Rolf Turner [mailto:r.turner at auckland.ac.nz]
> Sent: Saturday, April 09, 2016 5:27 AM
> To: Jeff Newmiller
> Cc: Paulson, Ariel; 'r-help at r-project.org'
> Subject: Re: [FORGED] Re: [R] identical() versus sapply()
>
> On 09/04/16 16:24, Jeff Newmiller wrote:
> I highly recommend making friends with the str function. Try
>
> str( 1 )
> str( 1:2 )
>
> Interesting. But to me counter-intuitive. Since R makes no distinction between scalars and vectors of length 1 (or more accurately I think, since in R there is *no such thing as a scalar*, only a vector of length 1) I don't see why "1" should be treated in a manner that is categorically different from the way in which "1:2" is treated.
>
> Can you, or someone else with deep insight into R and its rationale, explain the basis for this difference in treatment?
>
> for the clue you need, and then
>
> sapply( 1:2, identical, 1L )
>
> cheers,
>
> Rolf
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>
> ________________________________
>
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> [[alternative HTML version deleted]]
>