[R] regex -> negate a word
Wacek Kusnierczyk
Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Mon Jan 19 10:28:23 CET 2009
Stavros Macrakis wrote:
> On Sun, Jan 18, 2009 at 2:22 PM, Wacek Kusnierczyk
> <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:
>
>
>> x[-grep("abc", x)]
>> which unfortunately fails if none of the strings in x matches the pattern, i.e., grep returns integer(0);
>>
>
> Yes.
>
>
>> arguably, x[integer(0)] should rather return all elements of x
>>
>
> The meaning of x[V] (for an integer subscript vector V) is:
what about non-integer numeric vectors? R silently truncates them towards zero here:
x[1.1]
# same as x[1]
x[0.3]
# character(0)
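
a small sketch of that truncation, assuming a three-element character vector x (the values are made up for illustration). the point is that the index is truncated towards zero, not rounded:

```r
x <- c("a", "b", "c")   # illustrative data, not from the thread

i_low  <- x[1.1]   # index truncated to 1
i_high <- x[1.9]   # also truncated to 1, not rounded to 2
i_zero <- x[0.3]   # truncated to 0, and a 0 index selects nothing

print(i_low)           # "a"
print(i_high)          # "a"
print(length(i_zero))  # 0
```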
> ignore 0
> entries, and then:
>
what if V=NULL?
> a) if !(all(V>0) | all(V<0) ) => ERROR
>
there is no error for x[V] with V=0, V=as.numeric(NA), or V=NaN.
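
a minimal check of those three cases, again on a made-up vector x. for contrast, the last lines show the case that does raise an error (mixing positive and negative subscripts), caught with tryCatch so the snippet runs through:

```r
x <- c("a", "b", "c")   # illustrative data

r_zero <- x[0]                # zero is dropped: character(0)
r_na   <- x[as.numeric(NA)]   # a single NA of type character
r_nan  <- x[NaN]              # also a single NA

print(length(r_zero))  # 0
print(r_na)            # NA
print(r_nan)           # NA

# mixing signs is what actually triggers the error
mixed    <- tryCatch(x[c(-1, 1)], error = function(e) e)
is_error <- inherits(mixed, "error")
print(is_error)        # TRUE
```

note that as.numeric(NA) matters here: a bare NA is logical, so x[NA] is recycled and gives one NA per element of x.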
> b) if all (V>0): length(x[V]) == length(V)
>
unfortunately, false if V contains a non-integer (so it goes beyond your
discussion, but may cause problems in practice):
x[c(1, 0.5)]
# one item (if x is non-empty)
> c) if all (V<0): length(x[V]) == length(x)-length(unique(V))
>
not true for cases like V=c(-1, -1.5), which again go beyond your
discussion, but may happen in practice.
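
both counterexamples, to (b) and to (c), can be checked directly. this is only a sketch on a made-up vector x; the last part shows that truncating V first (with trunc) makes the count in (c) come out right again:

```r
x <- c("a", "b", "c")   # illustrative data

# (b): all entries are positive, yet the lengths differ
Vb <- c(1, 0.5)
b_pre <- all(Vb > 0)       # TRUE
b_lhs <- length(x[Vb])     # 1: 0.5 is truncated to 0 and dropped
b_rhs <- length(Vb)        # 2

# (c): all entries are negative, yet the count is off
Vc <- c(-1, -1.5)
c_pre <- all(Vc < 0)                       # TRUE
c_lhs <- length(x[Vc])                     # 2: both entries mean "drop element 1"
c_rhs <- length(x) - length(unique(Vc))    # 1

# truncating before counting gives the expected length for (c)
c_fixed <- length(x) - length(unique(trunc(Vc)))   # 2
```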
interestingly, unique(c(NA, NA)) is just NA, rather than c(NA, NA). i'd
think that if we have two non-available values, we can't be sure they're
in fact equal, but unique apparently treats them as equal. (you'd have to
tell it not to, with incomparables=NA.)
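
a short sketch of the difference. the first line shows why one might expect the NAs to be kept apart (comparing them gives NA, not TRUE); the rest compares unique with and without incomparables=NA:

```r
na_eq <- NA == NA                                  # NA: equality is unknown

u_default <- unique(c(NA, NA))                     # NAs collapsed into one
u_incomp  <- unique(c(NA, NA), incomparables = NA) # NAs kept apart

print(length(u_default))  # 1
print(length(u_incomp))   # 2
```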
> When length(V)==0, the preconditions are true for both (b) and (c), so
>
interestingly, all(V>0) & all(V<0) is TRUE for V=c().
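
this is the usual vacuous truth of all() on an empty logical vector; a quick check:

```r
V <- c()                    # c() with no arguments is NULL

cmp_len  <- length(V > 0)   # 0: comparing NULL gives logical(0)
both     <- all(V > 0) & all(V < 0)
any_none <- any(logical(0)) # any() is the dual: FALSE on empty input

print(is.null(V))  # TRUE
print(both)        # TRUE
print(any_none)    # FALSE
```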
> the R design has made the decision that length(x[V]) == 0 in this
> case. If you're going to have the "negative indices means exclusion"
> trick, this seems like a reasonable convention.
>
i didn't say this was unreasonable, just that x[integer(0)] should,
arguably, return x. 'empty index' is not a precise enough expression for
it to be obvious to everyone that integer(0) is *not* an empty index, and
even less so for NULL. what is meant, i guess, is 'empty index
expression', i.e., no index rather than an empty index, and i'd humbly
suggest (at the risk of being charged with boring pedantry) improving tfm.
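
to make the distinction concrete, here is a sketch on a made-up vector x in which nothing matches "abc". it contrasts no index at all with integer(0) and NULL, and then shows one way around the original grep problem: a logical mask from grepl (or grep with invert=TRUE), both of which exist in current R. this is one possible workaround, not necessarily what anyone in the thread had in mind:

```r
x <- c("foo", "bar")   # illustrative data with no match for "abc"

no_index  <- x[]             # no index: all of x
empty_int <- x[integer(0)]   # zero-length index: nothing
null_idx  <- x[NULL]         # NULL index: nothing either

# the idiom from the thread: -integer(0) is still integer(0)
negated <- x[-grep("abc", x)]

# a logical mask has no such edge case
masked   <- x[!grepl("abc", x)]
inverted <- grep("abc", x, invert = TRUE, value = TRUE)

print(no_index)         # "foo" "bar"
print(length(negated))  # 0
print(masked)           # "foo" "bar"
```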
vQ