[Rd] invert argument in grep
Duncan Murdoch
murdoch at stats.uwo.ca
Fri Nov 10 19:24:05 CET 2006
On 11/10/2006 12:52 PM, Romain Francois wrote:
> Duncan Murdoch wrote:
>> On 11/9/2006 5:14 AM, Romain Francois wrote:
>>> Hello,
>>>
>>> What about an `invert` argument in grep, to return elements that do
>>> *not* match a regular expression:
>>>
>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>>>
>>> would essentially return the same as :
>>>
>>> R> colors() [ - grep("pink", colors()) ]
>>>
>>>
>>> I'm attaching the files that I modified (against today's tarball) for
>>> that purpose.
>>
>> I think a more generally useful change would be to be able to return a
>> logical vector with TRUE for a match and FALSE for a non-match, so a
>> simple !grep(...) does the inversion. (This is motivated by the
>> recent R-help discussion of the fact that x[-selection] doesn't always
>> invert the selection when it's a vector of indices.)
>>
>> A way to do that without expanding the argument list would be to allow
>>
>> value="logical"
>>
>> as well as value=TRUE and value=FALSE.
>>
>> This would make boolean operations easy, e.g.
>>
>> colors()[grep("dark", colors(), value="logical")
>> & !grep("blue", colors(), value="logical")]
>>
>> to select the colors that contain "dark" but not "blue". (In this case
>> the RE to select that subset is rather simple because "dark" always
>> precedes "blue", but if that wasn't true, it would be a lot messier.)
>>
>> Duncan Murdoch
> Hi,
>
> It sounds like a nice thing to have. I would still prefer to type :
>
> R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE),
> value = TRUE )
That's good for intersecting two searches, but not for other boolean
combinations.

My main point was that inversion isn't the only boolean operation you
may want. R already has perfectly good, powerful boolean operators, so
building a limited subset of boolean algebra into grep() is probably
the wrong approach.
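
A minimal sketch of that idea using only the existing grep(): the helper
name lgrep is made up for illustration (it is not part of R) and stands in
for the proposed value="logical". A small fixed vector replaces colors()
so the results are easy to check.

```r
# Hypothetical helper, not part of base R: returns TRUE for each element
# of x that matches the pattern and FALSE otherwise.
lgrep <- function(pattern, x, ...) {
  seq_along(x) %in% grep(pattern, x, value = FALSE, ...)
}

x <- c("darkblue", "darkred", "blue", "pink", "lightblue")

# Elements that contain "dark" but not "blue"
dark_not_blue <- x[lgrep("dark", x) & !lgrep("blue", x)]

# Inversion when nothing matches: the logical form keeps every element,
# while negative indexing with an empty index vector drops everything.
kept_all  <- x[!lgrep("zzz", x)]
kept_none <- x[-grep("zzz", x)]
```

The last two lines show the x[-selection] pitfall mentioned earlier in
the thread: with no matches, kept_all is all of x, but kept_none is empty.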
>
>
> What about a way to pass more than one regular expression then be able
> to call :
>
> R> grep( c("dark", "blue"), colors(), value = TRUE, invert = c(TRUE, FALSE) )
Again, it covers & and !, but it misses other boolean operators.
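
To illustrate what a vector-of-patterns interface would leave out, here is
a rough sketch that computes one logical column per pattern and then
combines the columns with ordinary R operators. The variable names (pats,
m, either, exactly_one) and the sample vector are assumptions chosen for
the example.

```r
x    <- c("darkblue", "darkred", "blue", "pink", "lightblue")
pats <- c("dark", "blue")

# One logical column per pattern; vapply() names the columns after the
# patterns because pats is a character vector.
m <- vapply(pats,
            function(p) seq_along(x) %in% grep(p, x),
            logical(length(x)))

# OR: elements matching at least one of the patterns
either <- x[m[, "dark"] | m[, "blue"]]

# XOR: elements matching exactly one of the two patterns
exactly_one <- x[xor(m[, "dark"], m[, "blue"])]
```

Neither of these combinations can be expressed with a per-pattern invert
flag alone.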
> I usually use shortcuts like these, which are easy to remember.
>
> vgrep <- function(...) grep(..., value = TRUE)
> igrep <- function(...) grep(..., invert = TRUE)
> ivgrep <- vigrep <- function(...) grep(..., invert = TRUE, value = TRUE)
If you're willing to write these, then it's easy to write igrep without
an invert arg to grep:
igrep <- function(pat, x, ...)
    setdiff(seq_along(x), grep(pat, x, value = FALSE, ...))
ivgrep would also be easy, except for the weird semantics of value=TRUE
pointed out by Brian, but it could still be written with a little bit
of care.
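
A possible sketch of both helpers follows. It assumes that the care needed
for value=TRUE amounts to returning the elements coerced to character with
their names kept; that reading is an assumption, not something spelled out
in this message. seq_along(x) is used so that a zero-length x gives a
zero-length result.

```r
# Indices of the elements of x that do NOT match pat
igrep <- function(pat, x, ...) {
  setdiff(seq_along(x), grep(pat, x, value = FALSE, ...))
}

# Values of the elements of x that do NOT match pat
ivgrep <- function(pat, x, ...) {
  idx <- igrep(pat, x, ...)
  # Assumption: mimic value = TRUE by coercing to character and
  # carrying the names of x over to the result.
  out <- as.character(x)
  names(out) <- names(x)
  out[idx]
}

x <- c(a = "darkblue", b = "pink", c = "blue")
non_blue_idx <- igrep("blue", x)
non_blue_val <- ivgrep("blue", x)
empty_idx    <- igrep("blue", character(0))
```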
Duncan Murdoch
>
> What about things like the arguments `after` and `before` in unix grep.
> That could be used when grepping inside a function :
>
> R> grep("plot\\.", body(plot.default) , value= TRUE)
> [1] "localWindow <- function(..., col, bg, pch, cex, lty, lwd)
> plot.window(...)"
> [2] "plot.new()"
> [3] "plot.xy(xy, type, ...)"
>
>
> whereas this could possibly be more useful:
>
> R> # grep("plot\\.", plot.default, after = 2, value = TRUE)
> R> tmp <- tempfile(); sink(tmp) ; print(body(plot.default)); sink();
> system( paste( "grep -A2 plot\\. ", tmp) )
> localWindow <- function(..., col, bg, pch, cex, lty, lwd)
> plot.window(...)
> localTitle <- function(..., col, bg, pch, cex, lty, lwd) title(...)
> xlabel <- if (!missing(x))
> --
> plot.new()
> localWindow(xlim, ylim, log, asp, ...)
> panel.first
> plot.xy(xy, type, ...)
> panel.last
> if (axes) {
> --
> if (frame.plot)
> localBox(...)
> if (ann)
>
>
> BTW, if I call :
>
> R> grep("plot\\.", plot.default)
> Error in as.character(x) : cannot coerce to vector
>
> What about adding that line at the beginning of grep, or something else
> to be able to do as.character on a function ?
>
> if(is.function(x)) x <- body(x)
>
>
> Cheers,
>
> Romain
>>>
>>> Cheers,
>>>
>>> Romain
>
>