[R] ? Exact pattern matching in GREP ?

Thomas Lumley tlumley at u.washington.edu
Fri Sep 27 19:19:41 CEST 2002


On Fri, 27 Sep 2002, Derek Eder wrote:

> How is exact pattern matching achieved in GREP (and GREPlike) functions ?

By choosing the right pattern :)

Regular expressions are designed to match substrings of a string, and can
do very complicated matching.  They have a different syntax from filename
wildcard expressions like *.lm, and don't automatically tie down the
beginning or end of a string.  Google for 'regular expressions' to find
some helpful descriptions.


> # Want: listing of all object names that end in *.lm
> > objects(pattern="*.lm",pos=1)
> #  ...  but get:  all objects that partially match *.lm, e.g., *.lme
> [1] "j3.lm"  "J3.lme"  "j8.lm"  "J8.lme"

The regular expression *.lm matches any string containing at least one
character followed by "lm" (the "." matches any single character, not a
literal dot), so it matches all of your examples and would also match
"almost" and "calmly" but would not match, for example, "lm" or "lme".
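
A quick way to check this is to test the candidate strings directly.  This
is only a sketch: it uses the pattern ".lm" (leaving off the leading "*",
whose treatment may differ between regex engines), and the vector name x is
made up for illustration.

```r
# Hypothetical test strings, taken from the examples above
x <- c("j3.lm", "J3.lme", "almost", "calmly", "lm", "lme")

# "." matches any single character, so ".lm" needs one character before "lm"
hits <- grepl(".lm", x)

# Show which strings matched, labelled by the string itself
print(setNames(hits, x))
```

The first four strings match; "lm" and "lme" do not, because nothing
precedes the "lm".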

You appear to want a string ending in ".lm", for which the pattern is
"\\.lm$".  The "\\." matches a literal "." (whereas a bare "." matches any
character), and the "$" matches the end of the string.
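
As a minimal sketch of the anchored pattern, assuming a character vector of
names like the ones in the question (nms and lm.only are made-up names, and
the two decoys are added for illustration):

```r
# Hypothetical object names, as in the question, plus two decoys
nms <- c("j3.lm", "J3.lme", "j8.lm", "J8.lme", "almost", "calmly")

# "\\." is a literal dot; "$" anchors the match at the end of the string
lm.only <- grep("\\.lm$", nms, value = TRUE)
print(lm.only)
```

The same pattern can be passed to objects(), e.g.
objects(pattern="\\.lm$", pos=1).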


>
> #  Want:  position of string "4jan2002" in vector
> > date.index <- grep("4jan2002", my.dates)
> # .... but get:
> > my.dates[date.index]
> [1]  "4jan2002"  "24jan2002"   "14jan2002"
>

If you want exact string equality, the best approach is to use == rather
than grep, e.g.
  which(my.dates == "4jan2002")

It's possible to do this using grep(), but it's harder -- you are looking
for a string that starts "4jan2002" and then ends. The regular expression
for this is
    "^4jan2002$"
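
The three approaches can be compared side by side.  This is a sketch with a
made-up my.dates vector (the fourth entry is added for illustration); the
index names are hypothetical too.

```r
# Hypothetical data resembling the vector in the question
my.dates <- c("4jan2002", "24jan2002", "14jan2002", "5jan2002")

# Unanchored pattern: matches every string that contains "4jan2002"
sub.idx <- grep("4jan2002", my.dates)

# Exact equality: no regular expression involved
eq.idx <- which(my.dates == "4jan2002")

# Anchored pattern: "^" ties down the start, "$" the end
rx.idx <- grep("^4jan2002$", my.dates)

print(list(substring = sub.idx, equality = eq.idx, anchored = rx.idx))
```

Only the unanchored grep() picks up "24jan2002" and "14jan2002"; the other
two return just the position of "4jan2002".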


Filename wildcards translate like this:

     *.lm        "\\.lm$"
     test.*      "^test\\."
     test?.lm    "^test.\\.lm$"
     test.lm*    "^test\\.lm"
     *.*         "\\."
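
Current versions of R also provide glob2rx() in the utils package, which
does this kind of translation for you.  It is not part of the advice above,
and the patterns it returns may be written slightly differently from the
table (for instance with an explicit "^.*" at the start), so treat this as a
sketch; the names in nms are made up.

```r
# glob2rx() lives in the utils package, which is attached by default
nms <- c("j3.lm", "J3.lme", "test.lm", "test1.lm", "test.lme", "mytest.lm")

# "*.lm": names ending in ".lm"
ends.lm <- grep(glob2rx("*.lm"), nms, value = TRUE)

# "test?.lm": "test", exactly one more character, then ".lm"
one.char <- grep(glob2rx("test?.lm"), nms, value = TRUE)

# "test.lm*": names starting with "test.lm"
starts <- grep(glob2rx("test.lm*"), nms, value = TRUE)

print(list(ends.lm = ends.lm, one.char = one.char, starts = starts))
```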


	-thomas



