[Rd] invert argument in grep
Gabor Grothendieck
ggrothendieck at gmail.com
Sun Nov 12 17:02:15 CET 2006
invert= would be consistent with the fact that egrep (-v), sed (!), vi (:v)
and awk (!~) all have special facilities, as indicated, to handle such
negation/inversion.
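
For illustration, here is a minimal sketch of the two approaches discussed in this thread. It assumes a current version of R in which grep() accepts invert= and grepl() returns a logical vector. The sample vector x and the variable names are made up for the example.

```r
# Illustrative sample data (an assumption, not taken from the thread)
x <- c("pink", "darkblue", "darkred", "blue", "hotpink")

# Inversion inside grep(): return the elements that do NOT match
not_pink <- grep("pink", x, invert = TRUE, value = TRUE)

# The same selection with a logical vector and R's own negation
not_pink_lgl <- x[!grepl("pink", x)]

# Logical vectors combine with any boolean operator, not just negation:
# elements containing "dark" but not "blue"
dark_not_blue <- x[grepl("dark", x) & !grepl("blue", x)]

# Pitfall of negative indexing: with no match, grep() returns integer(0),
# and x[-integer(0)] selects nothing rather than everything
no_match_neg <- x[-grep("zzz", x)]
no_match_inv <- grep("zzz", x, invert = TRUE, value = TRUE)

print(not_pink)
print(dark_not_blue)
print(length(no_match_neg))
print(length(no_match_inv))
```

The last two results differ: negative indexing with an empty match drops every element, while invert = TRUE keeps all of them. This is the x[-selection] problem mentioned in the quoted discussion.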
On 11/12/06, Romain Francois <rfrancois at mango-solutions.com> wrote:
> Duncan Murdoch wrote:
> > On 11/10/2006 12:52 PM, Romain Francois wrote:
> >> Duncan Murdoch wrote:
> >>> On 11/9/2006 5:14 AM, Romain Francois wrote:
> >>>> Hello,
> >>>>
> >>>> What about an `invert` argument in grep, to return elements that
> >>>> are *not* matching a regular expression :
> >>>>
> >>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
> >>>>
> >>>> would essentially return the same as :
> >>>>
> >>>> R> colors() [ - grep("pink", colors()) ]
> >>>>
> >>>>
> >>>> I'm attaching the files that I modified (against today's tarball)
> >>>> for that purpose.
> >>>
> >>> I think a more generally useful change would be to be able to return
> >>> a logical vector with TRUE for a match and FALSE for a non-match, so
> >>> a simple !grep(...) does the inversion. (This is motivated by the
> >>> recent R-help discussion of the fact that x[-selection] doesn't
> >>> always invert the selection when it's a vector of indices.)
> >>>
> >>> A way to do that without expanding the argument list would be to allow
> >>>
> >>> value="logical"
> >>>
> >>> as well as value=TRUE and value=FALSE.
> >>>
> >>> This would make boolean operations easy, e.g.
> >>>
> >>> colors()[grep("dark", colors(), value="logical")
> >>> & !grep("blue", colors(), value="logical")]
> >>>
> >>> to select the colors that contain "dark" but not "blue". (In this
> >>> case the RE to select that subset is rather simple because "dark"
> >>> always precedes "blue", but if that wasn't true, it would be a lot
> >>> messier.)
> >>>
> >>> Duncan Murdoch
> >> Hi,
> >>
> >> It sounds like a nice thing to have. I would still prefer to type :
> >>
> >> R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE),
> >> value = TRUE )
> >
> > That's good for intersecting two searches, but not for other boolean
> > combinations.
> >
> > My main point was that inversion isn't the only boolean operation you
> > may want, but R has perfectly good powerful boolean operators, so
> > installing a limited subset of boolean algebra into grep() is probably
> > the wrong approach.
>
> Hi,
>
> Yes, good point. I agree with you that the value = "logical" is probably
> worth having to take advantage of these logical operators.
>
> .... but, what about all these functions calling grep and passing
> arguments through the ellipsis. With this invert argument, we could do :
>
> R> history(pattern = "grid\\..*\\(", invert = TRUE)
>
> BTW, why not use ... in ls ? in case someone would like to use perl
> regex to use ls, or to get back at this thread, issue commands like :
>
> R> ls("package:grid", pattern = "^grid\\.|Grob$", invert = TRUE)
> [1] "absolute.size" "applyEdit" "applyEdits"
> [4] "arcCurvature" "arrow" "childNames"
> [7] "convertHeight" "convertNative" "convertUnit"
> [10] "convertWidth" "convertX" "convertY"
> [13] "current.transform" "current.viewport" "current.vpPath"
> [16] "current.vpTree" "dataViewport" "downViewport"
> [19] "draw.details" "drawDetails" "editDetails"
> [22] "engine.display.list" "gEdit" "gEditList"
> [25] "get.gpar" "getNames" "gList"
> [28] "gpar" "gPath" "grob"
> [31] "grobHeight" "grobName" "grobWidth"
> [34] "grobX" "grobY" "gTree"
> [37] "heightDetails" "is.unit" "layout.heights"
> [40] "layoutRegion" "layout.torture" "layout.widths"
> [43] "plotViewport" "pop.viewport" "popViewport"
> [46] "postDrawDetails" "preDrawDetails" "push.viewport"
> [49] "pushViewport" "seekViewport" "setChildren"
> [52] "stringHeight" "stringWidth" "unit"
> [55] "unit.c" "unit.length" "unit.pmax"
> [58] "unit.pmin" "unit.rep" "upViewport"
> [61] "validDetails" "viewport" "viewport.layout"
> [64] "viewport.transform" "vpList" "vpPath"
> [67] "vpStack" "vpTree" "widthDetails"
> [70] "xDetails" "yDetails"
>
> Then, what about ... in apropos ?
>
> Regards,
>
> Romain
>
>
> >>
> >>
> >> What about a way to pass more than one regular expression then be
> >> able to call :
> >>
> >> R> grep( c("dark", "blue"), colors(), value = TRUE, invert = c(TRUE,
> >> FALSE) )
> >
> > Again, it covers & and !, but it misses other boolean operators.
> >
> >> I usually use that kind of shortcuts that are easy to remember.
> >>
> >> vgrep <- function(...) grep(..., value = TRUE)
> >> igrep <- function(...) grep(..., invert = TRUE)
> >> ivgrep <- vigrep <- function(...) grep(..., invert = TRUE, value = TRUE)
> >
> > If you're willing to write these, then it's easy to write igrep
> > without an invert arg to grep:
> >
> > igrep <- function(pat, x, ...)
> > setdiff(1:length(x), grep(pat, x, value = FALSE, ...))
> >
> > ivgrep would also be easy, except for the weird semantics of
> > value=TRUE pointed out by Brian: but it could still be written with a
> > little bit of care.
> >
> > Duncan Murdoch
> >
> >>
> >> What about things like the arguments `after` and `before` in unix
> >> grep. That could be used when grepping inside a function :
> >>
> >> R> grep("plot\\.", body(plot.default) , value= TRUE)
> >> [1] "localWindow <- function(..., col, bg, pch, cex, lty, lwd)
> >> plot.window(...)"
> >> [2] "plot.new()"
> >> [3] "plot.xy(xy, type, ...)"
> >>
> >>
> >> when this could be useful (possibly).
> >>
> >> R> # grep("plot\\.", plot.default, after = 2, value = TRUE)
> >> R> tmp <- tempfile(); sink(tmp) ; print(body(plot.default)); sink();
> >> system( paste( "grep -A2 plot\\. ", tmp) )
> >> localWindow <- function(..., col, bg, pch, cex, lty, lwd)
> >> plot.window(...)
> >> localTitle <- function(..., col, bg, pch, cex, lty, lwd) title(...)
> >> xlabel <- if (!missing(x))
> >> --
> >> plot.new()
> >> localWindow(xlim, ylim, log, asp, ...)
> >> panel.first
> >> plot.xy(xy, type, ...)
> >> panel.last
> >> if (axes) {
> >> --
> >> if (frame.plot)
> >> localBox(...)
> >> if (ann)
> >>
> >>
> >> BTW, if I call :
> >>
> >> R> grep("plot\\.", plot.default)
> >> Error in as.character(x) : cannot coerce to vector
> >>
> >> What about adding that line at the beginning of grep, or something
> >> else to be able to do as.character on a function ?
> >>
> >> if(is.function(x)) x <- body(x)
> >>
> >>
> >> Cheers,
> >>
> >> Romain
> >>>>
> >>>> Cheers,
> >>>>
> >>>> Romain
> >>
> >>
> >
> >
>
>
> --
> *mangosolutions*
> /data analysis that delivers/
>
> Tel +44 1249 467 467
> Fax +44 1249 467 468
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>