[Rd] Versions of PCRE, documenting what grep etc do.
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sat Oct 25 10:57:39 MEST 2003
I have added a preliminary help page for regex to R-patched which should
help for now, and added a configure test for PCRE >= 4.0 to R-devel.
I will return to this later in 2003.
Brian
On Fri, 24 Oct 2003, Kurt Hornik wrote:
> >>>>> Prof Brian Ripley writes:
>
> > A couple of weeks back there was some discussion about documenting the
> > regular expressions as used in R. Several years ago the problem was
> > that this was OS-dependent, and to plug that problem we incorporated
> > regexp code from a version of GNU grep, later updated to grep-2.4.2 in
> > R 1.2.0.
>
> > I have been looking at documenting what grep(perl=TRUE) does, and we
> > have a similar problem in that the current PCRE, 4.4, implements
> > rather more of Perl's regexps than 3.9 (which is in 1.8.0 if the OS
> > does not supply it). RH8.0 has PCRE 3.9, and whichever version of
> > Debian is on franz has PCRE 3.4.
>
> > I could add a configure check for PCRE >= 4.0, and I think probably
> > should do that. However, my inclination is to always use the version
> > of PCRE in the R sources and thereby ensure that all builds of R have
> > the same version, the one I will document. Comments, please.
>
> I think we should in any case allow maintainers of binary packages on
> platforms with advanced package management systems to force the use of
> shared libraries the system can provide. (So the binary maintainers
> would need to verify that the system package provides the right libs and
> headers.)
>
> Not sure about the default: we typically try to use available system
> resources, unless this is bound to cause problems, and regex was of the
> latter type, afaicr.
>
> > For PCRE 4.4 there is a long man page that I will use as a basis for
> > the documentation. I am inclined just to include either a text or PDF
> > version of the man page -- any preferences for which form?
>
> Depends on where you would put the docs, I think. Btw, where can 4.4 be
> found?
>
> > For the non-Perl regexps it is harder, as I am unsure exactly what
> > patterns the GNU regex we have accepts. (From a problem which
> > occurred with some Sweave regexps, I think it accepts more than it is
> > intended to.) One fairly good docu source is the GNU grep man page:
> > does anyone know a better one? I had thought of writing a regexp.Rd
> > help page to which grep.Rd could refer.
>
> That would be great. Linux has a regex(7) purported to be "taken from
> Henry Spencer's regex package", which might be used as a start. The old
> GNU regex .tar.gz has a texinfo file, but does not help for what we
> need, I think.
>
> [I recently looked for available regexp docs, but was not too
> successful.]
>
> > None of this is imminent (I am too busy) but is intended for the next
> > minor release (which may be called 1.9.0 or 2.0.0, I gather).
>
> Too bad :-(
>
> Best
> -k
>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595