[R] NA treatment when comparing two vectors
Marc Schwartz
marc_schwartz at me.com
Tue Oct 4 20:37:33 CEST 2016
Hi,
A couple of comments:
1. Fabien, since you are transitioning from SAS, you may find the resources that Bob Muenchen has made available to be of value:
http://r4stats.com/
There is a free download (PDF) of an earlier version of his book available via his web site as well.
2. Presuming that Bert's approach satisfies your functional requirements in this case, you can encapsulate his code into a function that you might find easier to use moving forward. For example:
IsDiff <- function(a, b, tol = 1e-15) {
  xor(is.na(a), is.na(b)) | (abs(b - a) > tol)
}
a <- c(1, 2, 3, NA, NA)
b <- c(1, 9, NA, 4, NA)
> IsDiff(a, b)
[1] FALSE TRUE TRUE TRUE NA
Of course, you can call the function anything you wish, as long as it is a legal object name in R.
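If the NA returned when both elements are missing gets in the way (for example, when the result is used to subset), one possible variant returns FALSE there instead. This is only a sketch: the name IsDiff2 is made up for illustration, and it keeps the same absolute tolerance as IsDiff:

```r
# Hypothetical variant of IsDiff(): FALSE rather than NA when both are missing
IsDiff2 <- function(a, b, tol = 1e-15) {
  one_missing <- xor(is.na(a), is.na(b))  # exactly one of the pair is NA
  differ <- abs(b - a) > tol              # NA wherever either element is NA
  # FALSE & NA is FALSE, so the NA comparisons drop out here
  one_missing | (!is.na(differ) & differ)
}

a <- c(1, 2, 3, NA, NA)
b <- c(1, 9, NA, 4, NA)
IsDiff2(a, b)
# FALSE TRUE TRUE TRUE FALSE
```

Since the result contains no NA, it can be passed directly to which() or used as a logical index.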
You could even create a new infix operator along the lines of the following. The restriction is that an infix operator takes exactly two arguments, so you would hard code the tolerance level in the function body:
"%ID%" <- function(a, b) {
  xor(is.na(a), is.na(b)) | (abs(b - a) > 1e-15)
}
> a %ID% b
[1] FALSE TRUE TRUE TRUE NA
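One caveat on the tolerance itself: 1e-15 is an absolute tolerance. Once the values are larger than about 8 in magnitude, any two distinct doubles differ by more than 1e-15, so it no longer absorbs rounding differences. A relative tolerance may suit your data better. The following is a rough sketch under my own assumptions, not a recommendation: the name IsDiffRel is hypothetical, the default tolerance is borrowed from all.equal() (sqrt(.Machine$double.eps), about 1.5e-8), and the scale is floored at 1 so that values near zero are still compared on an absolute basis:

```r
# Hypothetical variant using a relative tolerance
IsDiffRel <- function(a, b, tol = sqrt(.Machine$double.eps)) {
  # Scale by the larger magnitude of each pair, but never by less than 1;
  # na.rm = TRUE keeps the scale itself free of NA
  scale <- pmax(abs(a), abs(b), 1, na.rm = TRUE)
  xor(is.na(a), is.na(b)) | (abs(b - a) > tol * scale)
}

a <- c(1, 2, 3, NA, NA)
b <- c(1, 9, NA, 4, NA)
IsDiffRel(a, b)
# FALSE TRUE TRUE TRUE NA

# Large values computed two different ways are treated as equal
IsDiffRel(1e6 * (0.1 + 0.2), 1e6 * 0.3)
# FALSE
```

Whether an absolute or a relative tolerance is appropriate, and how large it should be, depends on how the numbers being compared were computed.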
Bear in mind that a substantial portion of R is written in R itself, with a core set of functionality in C and FORTRAN where compiled code materially improves performance. Thus, as Bert notes, useRs typically extend R's functionality by encapsulating it in new R functions, and the now thousands of packages on CRAN generally follow that same paradigm for specific applications.
Regards,
Marc Schwartz
> On Oct 4, 2016, at 12:33 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>
> Fabien:
>
> In general, R's philosophy as a programming language is that it should
> make it easy (and maybe efficient) to do the (data analysis) things
> you want to do, not necessarily provide all pre-packaged procedures
> (although with all the packages, it seems to come close to that!). So
> the following seems to fall into that paradigm (using your example a
> and b)
>
>> tol <- 1e-15
>
>> xor(is.na(a),is.na(b)) | (abs(b-a) > tol)
> [1] FALSE TRUE TRUE TRUE NA
>
>
> As you noted, defining equality of floating point numbers is a tricky
> business, so that you may prefer some other approach to that which I
> used. There may well be "pre-packaged" ways to do this, but I didn't
> look. You might try searching rseek.org for "defining numerical
> equality in R" or some such to see.
>
>
> Cheers,
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Tue, Oct 4, 2016 at 9:51 AM, fabien verger <fabien.verger at gmail.com> wrote:
>> Hello,
>>
>> I want to get the differences when comparing 2 vectors, by pair (element by
>> element).
>> I'd like to get TRUEs when:
>> - the two compared elements are different and non-missing (like `!=` does)
>> - one element is missing and the other is not missing (unfortunately `!=`
>> gives NA and not TRUE)
>> Note that I don't want to get TRUEs when both are missing. NA or FALSE are
>> fine.
>>
>> Given a and b:
>>> a <- c(1, 2, 3, NA, NA)
>>> b <- c(1, 9, NA, 4 , NA)
>>
>> The only solution I found is:
>>
>>> a != b | (is.na(a) != is.na(b))
>> [1] FALSE TRUE TRUE TRUE NA
>>
>> Is there a single function which can do the same?
>> I searched for other comparison tools but found nothing relevant.
>>
>> And I would like also to avoid using `!=` because I'm often comparing
>> floating numbers computed by different algorithms (so rounded differently).
>>
>> I found identical() interesting (for example, !(identical(NA, 99)) gives
>> TRUE) but the result of !(identical(a, b)) is a single logical, not a vector
>> of logicals.
>>
>> Many thanks in advance for your help.
>> P.S. I am new to R, coming from SAS. Actually, I'm looking for the R
>> function that replicates the SAS instruction: if a ^= b;