[Rd] invert argument in grep

Gabor Grothendieck ggrothendieck at gmail.com
Sun Nov 12 17:02:15 CET 2006


invert= would be consistent with the fact that egrep (-v), vi (:v), sed (!)
and awk (!~) all have special facilities, as indicated, to handle such
negation/inversion.
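
For readers comparing the approaches discussed in the quoted thread below, here is a minimal sketch. It assumes a recent R in which grep() accepts an invert argument and grepl() returns a logical vector; neither is confirmed by this thread, which only proposes them. The vector x is made-up illustration data, and igrep follows the helper suggested in the quoted discussion, written with seq_along() so an empty input is handled.

```r
# Hypothetical data for illustration only (not taken from the thread).
x <- c("darkblue", "darkred", "pink", "blue", "deeppink")

# 1. Pitfall of negative indexing: when nothing matches, grep() returns
#    integer(0), and x[-integer(0)] selects nothing instead of everything.
no_match <- grep("green", x)
broken <- x[-no_match]

# 2. Inversion via the invert argument (assumed available in your R).
not_pink <- grep("pink", x, invert = TRUE, value = TRUE)

# 3. Logical vectors combine with R's ordinary boolean operators.
#    grepl() is assumed here in place of the proposed value = "logical".
dark_not_blue <- x[grepl("dark", x) & !grepl("blue", x)]

# 4. Index-based inversion without an invert argument.
igrep <- function(pattern, x, ...) {
  setdiff(seq_along(x), grep(pattern, x, ...))
}
```

The logical form in step 3 generalises to any boolean combination (|, xor, and so on), which is the argument made in the quoted discussion for returning a logical vector rather than adding more options to grep().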


On 11/12/06, Romain Francois <rfrancois at mango-solutions.com> wrote:
> Duncan Murdoch wrote:
> > On 11/10/2006 12:52 PM, Romain Francois wrote:
> >> Duncan Murdoch wrote:
> >>> On 11/9/2006 5:14 AM, Romain Francois wrote:
> >>>> Hello,
> >>>>
> >>>> What about an `invert` argument in grep, to return elements that
> >>>> do *not* match a regular expression:
> >>>>
> >>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
> >>>>
> >>>> would essentially return the same as :
> >>>>
> >>>> R> colors() [ - grep("pink", colors()) ]
> >>>>
> >>>>
> >>>> I'm attaching the files that I modified (against today's tarball)
> >>>> for that purpose.
> >>>
> >>> I think a more generally useful change would be to be able to return
> >>> a logical vector with TRUE for a match and FALSE for a non-match, so
> >>> a simple !grep(...) does the inversion.  (This is motivated by the
> >>> recent R-help discussion of the fact that x[-selection] doesn't
> >>> always invert the selection when it's a vector of indices.)
> >>>
> >>> A way to do that without expanding the argument list would be to allow
> >>>
> >>> value="logical"
> >>>
> >>> as well as value=TRUE and value=FALSE.
> >>>
> >>> This would make boolean operations easy, e.g.
> >>>
> >>> colors()[grep("dark", colors(), value="logical")
> >>>       & !grep("blue", colors(), value="logical")]
> >>>
> >>> to select the colors that contain "dark" but not "blue". (In this
> >>> case the RE to select that subset is rather simple because "dark"
> >>> always precedes "blue", but if that wasn't true, it would be a lot
> >>> messier.)
> >>>
> >>> Duncan Murdoch
> >> Hi,
> >>
> >> It sounds like a nice thing to have. I would still prefer to type :
> >>
> >> R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE),
> >> value = TRUE )
> >
> > That's good for intersecting two searches, but not for other boolean
> > combinations.
> >
> > My main point was that inversion isn't the only boolean operation you
> > may want, but R has perfectly good powerful boolean operators, so
> > installing a limited subset of boolean algebra into grep() is probably
> > the wrong approach.
>
> Hi,
>
> Yes, good point. I agree with you that the value = "logical" is probably
> worth having to take advantage of these logical operators.
>
> .... but, what about all these functions calling grep and passing
> arguments through the ellipsis. With this invert argument, we could do :
>
> R> history(pattern = "grid\\..*\\(", invert = TRUE)
>
> BTW, why not use ... in ls? Someone might want to use perl regexes
> with ls or, to get back to this thread, issue commands like:
>
> R> ls("package:grid", pattern = "^grid\\.|Grob$", invert = TRUE)
>  [1] "absolute.size"       "applyEdit"           "applyEdits"
>  [4] "arcCurvature"        "arrow"               "childNames"
>  [7] "convertHeight"       "convertNative"       "convertUnit"
> [10] "convertWidth"        "convertX"            "convertY"
> [13] "current.transform"   "current.viewport"    "current.vpPath"
> [16] "current.vpTree"      "dataViewport"        "downViewport"
> [19] "draw.details"        "drawDetails"         "editDetails"
> [22] "engine.display.list" "gEdit"               "gEditList"
> [25] "get.gpar"            "getNames"            "gList"
> [28] "gpar"                "gPath"               "grob"
> [31] "grobHeight"          "grobName"            "grobWidth"
> [34] "grobX"               "grobY"               "gTree"
> [37] "heightDetails"       "is.unit"             "layout.heights"
> [40] "layoutRegion"        "layout.torture"      "layout.widths"
> [43] "plotViewport"        "pop.viewport"        "popViewport"
> [46] "postDrawDetails"     "preDrawDetails"      "push.viewport"
> [49] "pushViewport"        "seekViewport"        "setChildren"
> [52] "stringHeight"        "stringWidth"         "unit"
> [55] "unit.c"              "unit.length"         "unit.pmax"
> [58] "unit.pmin"           "unit.rep"            "upViewport"
> [61] "validDetails"        "viewport"            "viewport.layout"
> [64] "viewport.transform"  "vpList"              "vpPath"
> [67] "vpStack"             "vpTree"              "widthDetails"
> [70] "xDetails"            "yDetails"
>
> Then, what about ... in apropos ?
>
> Regards,
>
> Romain
>
>
> >>
> >>
> >> What about a way to pass more than one regular expression then be
> >> able to call :
> >>
> > >> R> grep( c("dark", "blue"), colors(), value = TRUE, invert = c(TRUE,
> > >> FALSE) )
> >
> > Again, it covers & and !, but it misses other boolean operators.
> >
> >> I usually use that kind of shortcuts that are easy to remember.
> >>
> >> vgrep <- function(...) grep(..., value = TRUE)
> >> igrep <- function(...) grep(..., invert = TRUE)
> >> ivgrep <- vigrep <- function(...) grep(..., invert = TRUE, value = TRUE)
> >
> > If you're willing to write these, then it's easy to write igrep
> > without an invert arg to grep:
> >
> > igrep <- function(pat, x, ...)
> >    setdiff(seq_along(x), grep(pat, x, value = FALSE, ...))
> >
> > ivgrep would also be easy, except for the weird semantics of
> > value=TRUE pointed out by Brian:  but it could still be written with a
> > little bit of care.
> >
> > Duncan Murdoch
> >
> >>
> >> What about things like the `after` and `before` context options
> >> (-A and -B) in unix grep? That could be used when grepping inside a
> >> function:
> >>
> >> R> grep("plot\\.", body(plot.default) , value= TRUE)
> >> [1] "localWindow <- function(..., col, bg, pch, cex, lty, lwd)
> >> plot.window(...)"
> >> [2] "plot.new()"
> >> [3] "plot.xy(xy, type, ...)"
> >>
> >>
> >> when this could be useful  (possibly).
> >>
> >> R> # grep("plot\\.", plot.default, after = 2, value = TRUE)
> >> R> tmp <- tempfile(); sink(tmp) ; print(body(plot.default)); sink();
> >> system( paste( "grep -A2 plot\\. ", tmp) )
> >>     localWindow <- function(..., col, bg, pch, cex, lty, lwd)
> >> plot.window(...)
> >>     localTitle <- function(..., col, bg, pch, cex, lty, lwd) title(...)
> >>     xlabel <- if (!missing(x))
> >> --
> >>     plot.new()
> >>     localWindow(xlim, ylim, log, asp, ...)
> >>     panel.first
> >>     plot.xy(xy, type, ...)
> >>     panel.last
> >>     if (axes) {
> >> --
> >>     if (frame.plot)
> >>         localBox(...)
> >>     if (ann)
> >>
> >>
> >> BTW, if I call :
> >>
> >> R> grep("plot\\.", plot.default)
> >> Error in as.character(x) : cannot coerce to vector
> >>
> >> What about adding that line at the beginning of grep, or something
> >> else to be able to do as.character on a function ?
> >>
> >> if(is.function(x)) x <- body(x)
> >>
> >>
> >> Cheers,
> >>
> >> Romain
> >>>>
> >>>> Cheers,
> >>>>
> >>>> Romain
> >>
> >>
> >
> >
>
>
> --
> *mangosolutions*
> /data analysis that delivers/
>
> Tel   +44 1249 467 467
> Fax   +44 1249 467 468
>