which() does not handle NAs in named vectors. (PR#226)
Martin Maechler
Martin Maechler <maechler@stat.math.ethz.ch>
Thu, 15 Jul 1999 14:57:22 +0200
>>>>> On Thu, 15 Jul 1999 09:14, ripley@stats.ox.ac.uk (Brian D. Ripley) said:
Thank you for the bug report
BDR> -- It is unclear to me that the handling of NAs is desirable, and
BDR> it has problems with names:
{function which in its present form very much evolved out of user wishes...}
BDR> z <- c(T,T,NA,F,T)
BDR> names(z) <- letters[1:5]
BDR> which(z)
BDR> Error: names attribute must be the same length as the vector
fixed for release-patches [available in a day or two from CRAN src/devel/]
and hence every new release.
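
For illustration only, here is a small sketch of how such a length mismatch can arise and how the mask `logic & !is.na(logic)` avoids it. It is an assumption that the old code subscripted the names by the raw logical vector; the variable names `keep`, `idx` and `naive_names` are made up for this example and are not part of the actual patch.

```r
z <- c(TRUE, TRUE, NA, FALSE, TRUE)
names(z) <- letters[1:5]

## Subscripting by a logical vector containing NA yields an NA element,
## so the names come out one longer than the number of TRUE entries.
naive_names <- names(z)[z]          # "a" "b" NA "e"

## Masking the NAs first keeps indices and names the same length.
keep <- z & !is.na(z)               # NA is treated as FALSE
idx <- seq(along = z)[keep]
names(idx) <- names(z)[keep]
idx                                 # indices 1, 2, 5 named a, b, e
```
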
BDR> (Why do the vector and its names have different subscripts? And
BDR> while you are correcting this,
BDR> Arguments:
BDR> x: a logical vector or array. `NA's are allowed an
BDR> omitted.
is now
x: a `logical' vector or array. `NA's are allowed
and omitted (treated as if `FALSE').
BDR> has a typo, and the logic can be simplified: see below.)
BDR> On Thu, 15 Jul 1999, Martin Maechler wrote:
>> >>>>> "BDR" == Prof Brian D Ripley <ripley@stats.ox.ac.uk> writes:
>>
BDR> On Wed, 14 Jul 1999, Friedrich Leisch wrote:
>> >> >>>>> On Wed, 14 Jul 1999 04:09:21, Peter B Mandeville (PBM) wrote:
>> >>
PBM> I have a vector Pes with 600 elements some of which are NA's. How
PBM> can I form a vector of the indices of the NA's.
>> >>
PBM> for(i in 1:600) if(is.na(Pes[i])) print(i)
>> >>
PBM> prints the indices of the NA's but I can't figure out how to put
PBM> the results in a vector.
>> >> try this:
>> >>
>> >> x <- (1:length(Pes))[is.na(Pes)]
>>
BDR> Tip: that sort of thing often fails for a length 0 vector. The
BDR> `approved' spell is
>>
BDR> seq(along=Pes)[is.na(Pes)]
BTW, currently seq(along = x) returns "numeric" ("double")
whereas 1:length(x) returns "integer".
I'm about to fix this...
BDR> In this case it does not matter as the subscript is of length 0,
BDR> but it has floored enough library/package writers to be worth
BDR> thinking about.
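
A minimal sketch of the zero-length pitfall, using a loop where it does matter. The vector `empty` and the counters `n_colon` and `n_seq` are hypothetical names chosen for this example.

```r
empty <- numeric(0)

## 1:length(empty) is 1:0, i.e. the two values 1 and 0,
## so the loop body runs twice on a vector with no elements.
n_colon <- 0
for (i in 1:length(empty)) n_colon <- n_colon + 1

## seq(along = empty) has length zero, so the loop body never runs.
n_seq <- 0
for (i in seq(along = empty)) n_seq <- n_seq + 1

c(colon = n_colon, seq = n_seq)     # 2 and 0
```
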
>> Good teaching about seq() vs. 1:n
>>
>> However, the solution I gave
>>
>> which(is.na(Pes))
>>
>> is the one I still really recommend; it does deal with 0-length
>> objects, and it keeps names when there are some, and it has an
>> `arr.ind = FALSE' argument to return array indices instead of vector
>> indices when so desired.
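
The three properties claimed above can be checked on toy data. This is only a sketch: the named vector `Pes` below is a four-element stand-in for the 600-element original, and the matrix `m` is invented for the example.

```r
## Names are kept when the input has them.
Pes <- c(first = 1.5, second = NA, third = 3, fourth = NA)
na_idx <- which(is.na(Pes))         # 2 and 4, named "second" and "fourth"

## A zero-length input gives a zero-length result.
none <- which(is.na(numeric(0)))

## arr.ind = TRUE returns one (row, col) pair per hit for a matrix.
m <- matrix(c(1, NA, 3, 4, NA, 6), nrow = 2)
pos <- which(is.na(m), arr.ind = TRUE)
pos                                 # rows (2, 1) and (1, 3)
```
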
BDR> Yes, but
BDR> -- It is not in S (so causing difficulty in porting from R to S)
Well, I know what you mean, and your point is well taken in the above case...
but anyway:
Our group here has been using this ("which" function) in S for quite a while, and
eventually someone will have to collect a library of things from R that are
missing in S-plus and easily implementable.
And then, for quite a few R users, S-plus backward compatibility is not the
big issue. Locally, in our collection of S-plus add-ons, we already have
quite a few such functions.
And in other ways, R is so much nicer
- math annotation in graphics
- color, line types { plot(x,y, col="light blue", col.main = "blue") }
- filled.contour
- persp() with shading..
I think if you want to live in both worlds, you will want (and I recommend) to use
    if(is.R()) {
        ...R specific...
    } else { ## S-plus ---
        ...S-plus specific...
    }
anyway, even within user written functions,
and make sure (via .First or S_FIRST or ...) that is.R() |--> FALSE in S-plus
BDR> -- It looks a relatively expensive operation.
I don't think it is expensive (for arr.ind=FALSE !) if you want to deal
with missings (NA) at all. (Peter's example above is one of the few places
where you are absolutely sure there are no missings...)
Assume x has some NAs, e.g.
x <- rnorm(1000); x[1000*runif(rpois(1,lam=50))] <- NA
Then
which( x < -2 )
works how one would want;
seq(along = x)[x < -2]
gives silly NA's (which make sense for the logical vector but not for the
extraction).
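
The same contrast on a small fixed vector, so the result does not depend on random numbers. The values in `x` and the names `via_which` and `via_seq` are made up for this sketch.

```r
x <- c(-2.5, 0.3, NA, -3.1, 1.2)

## The comparison itself is TRUE FALSE NA TRUE FALSE.
via_which <- which(x < -2)          # 1 4: the NA is dropped
via_seq <- seq(along = x)[x < -2]   # 1 NA 4: the NA is carried into the indices
```
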
BDR> -- Internally which could be simplified by using seq(along=) as it is a wrapper for
BDR> this construct, but actually the separate handling of n == 0 is
BDR> unnecessary (as logic & !is.na(logic) will have length zero.)
You are right, and that's part of the fix for `which' which is currently
which <- function(logic, arr.ind = FALSE)
{
    if(!is.logical(logic))
        stop("argument to \"which\" is not logical")
    wh <- seq(along=logic)[ll <- logic & !is.na(logic)]
    if ((m <- length(wh)) > 0) {
        dl <- dim(logic)
        if (is.null(dl) || !arr.ind) {
            names(wh) <- names(logic)[ll]
        }
        else { ##-- return a matrix  length(wh) x rank
            rank <- length(dl)
            wh1 <- wh - 1
            wh <- 1 + wh1 %% dl[1]
            wh <- matrix(wh, nrow = m, ncol = rank,
                         dimnames =
                         list(dimnames(logic)[[1]][wh],
                              if(rank == 2) c("row", "col") # for matrices
                              else paste("dim", 1:rank, sep="")))
            if(rank >= 2) {
                denom <- 1
                for (i in 2:rank) {
                    denom <- denom * dl[i-1]
                    nextd1 <- wh1 %/% denom # (next dim of elements) - 1
                    wh[,i] <- 1 + nextd1 %% dl[i]
                }
            }
            storage.mode(wh) <- "integer"
        }
    }
    wh
}
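
As a worked example of the index arithmetic in the `arr.ind` branch, the sketch below converts a single linear index into array subscripts by hand and compares the result with what `which(..., arr.ind = TRUE)` returns. The dimensions `dl = c(2, 3, 4)` and the index 17 are arbitrary choices for illustration, and `ind` and `a` are hypothetical names.

```r
dl <- c(2, 3, 4)        # hypothetical array dimensions
wh <- 17                # a linear (vector) index into such an array
wh1 <- wh - 1           # zero-based index

ind <- numeric(length(dl))
ind[1] <- 1 + wh1 %% dl[1]                  # first subscript cycles fastest
denom <- 1
for (i in 2:length(dl)) {
    denom <- denom * dl[i - 1]              # size of one step in dimension i
    ind[i] <- 1 + (wh1 %/% denom) %% dl[i]
}
ind                     # 1 3 3, since 1 + 0 + (3-1)*2 + (3-1)*6 = 17

## Cross-check against which() on an array with one TRUE element.
a <- array(FALSE, dim = dl)
a[wh] <- TRUE
which(a, arr.ind = TRUE)
```
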
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._