[R] when to use `which'?

William Dunlap wdunlap at tibco.com
Wed Jul 13 17:49:45 CEST 2011

x[which(condition)], like the subset function, treats NAs in
condition as FALSE and hence does not output NAs for them.
I was also surprised to see that it runs a trifle faster than x[condition]
in R 2.13.0 if there are few TRUEs in condition and a trifle slower
if there are many TRUEs.
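A small sketch of the NA behaviour described above. The vector `x` and the condition `cond` are made-up illustration data, not taken from the original question.

```r
# Hypothetical data: two NAs among six values
x <- c(5, NA, -2, 8, NA, 1)
cond <- x > 0            # TRUE NA FALSE TRUE NA TRUE

x[cond]                  # 5 NA 8 NA 1: each NA in cond yields an NA element
x[which(cond)]           # 5 8 1: which() keeps only the TRUE positions
subset(x, cond)          # 5 8 1: subset() also treats NA in cond as FALSE
```

With many NAs in the condition, the first form can return far more elements than expected, which is the situation David Winsemius describes in the quoted message.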

A danger of the x[which(condition)] approach is the case
where you are trying to omit some entries by using a negative
integer subscript, as in
    x[-which(is.na(x))]
That is equivalent to
    x[!is.na(x)]
if there are any NAs in x, but if there are no NAs in x then
its output is a zero-length vector.
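A minimal sketch of this pitfall, assuming the negative-subscript expression in question is `x[-which(is.na(x))]` and the logical one is `x[!is.na(x)]`. The vectors `with_na` and `without_na` are hypothetical examples.

```r
with_na    <- c(1, NA, 3)
without_na <- c(1, 2, 3)

# With an NA present, both forms drop it
with_na[-which(is.na(with_na))]        # 1 3
with_na[!is.na(with_na)]               # 1 3

# With no NA, which() returns integer(0); negating it is still integer(0),
# so the subscript selects nothing instead of omitting nothing
without_na[-which(is.na(without_na))]  # numeric(0)
without_na[!is.na(without_na)]         # 1 2 3
```

The logical form behaves the same whether or not any entries match, which is why it tends to be the safer choice when the goal is to omit entries.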

For complicated conditions I find it easier to understand code
using logical operators
    x[!is.na(x) & x>0 & x<10]
than code applying set operators to the output of which
    x[intersect(setdiff(which(x>0), which(is.na(x))), which(x<10))]
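As a quick check that the two forms select the same elements, here is a sketch on a made-up vector (the data are an assumption for illustration only).

```r
# Hypothetical data with NAs and values on both sides of the (0, 10) range
x <- c(-3, NA, 4, 12, 7, NA, 0, 9)

# Logical-operator form
by_logical <- x[!is.na(x) & x > 0 & x < 10]

# Set-operator form built from which() results
by_sets <- x[intersect(setdiff(which(x > 0), which(is.na(x))), which(x < 10))]

by_logical                      # 4 7 9
identical(by_logical, by_sets)  # TRUE
```

Both give the same answer here; the difference is in how much work the reader has to do to see what is being selected.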

Bill Dunlap
TIBCO Spotfire

From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] on behalf of csrabak [crabak at acm.org]
Sent: Wednesday, July 13, 2011 6:20 AM
To: r-help at stat.math.ethz.ch
Subject: Re: [R] when to use `which'?

On 12/7/2011 17:29, David Winsemius wrote:

> If you have millions of records and tens of thousands of NA's (say ~ 1%
> of the data), imagine what your console looks like if you try to pick
> out records from one day and get 10,000 where you were expecting 100. A
> real PITA when you are doing real work.

I nominate this snippet of experience and wisdom to become a fortune :-)

Cesar Rabak
