[Rd] Versions of PCRE, documenting what grep etc do.
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Oct 24 08:46:41 MEST 2003
A couple of weeks back there was some discussion about documenting the
regular expressions as used in R. Several years ago the problem was that
this was OS-dependent, and to plug that problem we incorporated regexp
code from a version of GNU grep, later updated to grep-2.4.2 in R 1.2.0.
I have been looking at documenting what grep(perl=TRUE) does, and we
have a similar problem in that the current PCRE, 4.4, implements rather
more of Perl's regexps than 3.9 (which is the version used in 1.8.0 if
the OS does not supply it). RH8.0 has PCRE 3.9, and whichever version of
Debian is on franz has PCRE 3.4.
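
For anyone unsure what is at stake, here is a minimal sketch of the
difference between the two code paths. The vector x and the patterns are
made up for illustration and are not taken from any R documentation. The
lookahead construct (?=...) is Perl syntax, so it is only passed with
perl=TRUE; whether more recent Perl constructs work will depend on the
PCRE version the build uses.

```r
# Hypothetical test strings, chosen only for illustration.
x <- c("foobar", "foobaz", "barfoo")

# Perl-style lookahead: match "foo" only when it is followed by "bar".
# This goes through PCRE, so it needs perl = TRUE.
perl_hits <- grep("foo(?=bar)", x, perl = TRUE)

# A plain anchored pattern that the default (non-Perl) regexp code
# also handles: strings starting with "foo".
default_hits <- grep("^foo", x)

print(perl_hits)
print(default_hits)
```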
I could add a configure check for PCRE >= 4.0, and I think probably should
do that. However, my inclination is to always use the version of PCRE in
the R sources and thereby ensure that all builds of R have the same
version, the one I will document. Comments, please.
For PCRE 4.4 there is a long man page that I will use as a basis for the
documentation. I am inclined just to include either a text or PDF version
of the man page -- any preferences for which form?
For the non-Perl regexps it is harder, as I am unsure exactly what
patterns the GNU regex we have accepts. (From a problem which occurred
with some Sweave regexps, I think it accepts more than it is intended
to.) One fairly good source of documentation is the GNU grep man page:
does anyone know a better one? I had thought of writing a regexp.Rd help
page to which grep.Rd could refer.
None of this is imminent (I am too busy) but is intended for the next
minor release (which may be called 1.9.0 or 2.0.0, I gather).
Brian
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595