[Rd] Regular expressions & large strings (PR#6617)
Mark White
mjw at celos.net
Sat Feb 28 16:14:18 MET 2004
Prof Brian Ripley writes:
> I was able to confirm the error on RH8.0 Linux and the segfault on
> Windows.
>
> Note that PCRE is not being used, and if you add perl=TRUE to your [g]sub
> calls you get correct results extremely fast.
Thanks for clarifying that; I hadn't realised.
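
For the archive, a minimal sketch of the workaround as I understand it.
The test string and its size are made up for illustration; they are not
the input from the original report.

```r
# Build a large string; the size here is arbitrary, chosen for illustration.
x <- paste(rep("abc ", 50000), collapse = "")

# perl = TRUE routes the substitution through PCRE rather than the
# default regex code.
y <- gsub(" +", "_", x, perl = TRUE)

# Each single space becomes a single underscore, so the length is unchanged.
stopifnot(nchar(y) == nchar(x))
```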
> The segfault is occurring in regexec, that is in the GNU regex code
> included in R. I am not clear it is worth spending any time on trying to
> find the problem in that code as
>
> - you can use perl=TRUE as an alternative
> - we will be replacing the GNU regex code in due course to cope with
> internationalization issues.
Sounds fine. Do you think either of the following is worth
doing in the meantime?
- Add an strsplit() variant with PCRE (perhaps this
problem is related to PR#6601; and the speed might be
nice anyway).
- Add options(pcre) so the potentially bad code can be
avoided without explicitly setting perl=TRUE every time.
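
To make the two suggestions concrete, here is a rough sketch of what I
have in mind, done as user-level wrappers rather than changes to R
itself. The option name "pcre" and the helpers gsub_opt() and
strsplit_opt() are hypothetical; nothing here exists in R under those
names. The strsplit() part also assumes a version of R whose strsplit()
accepts a perl= argument.

```r
# Hypothetical wrappers: neither function nor the "pcre" option is part of R.
# Each reads a user-set option and forwards it as the perl= argument.
use_pcre <- function() isTRUE(getOption("pcre"))

gsub_opt <- function(pattern, replacement, x, ...) {
  gsub(pattern, replacement, x, perl = use_pcre(), ...)
}

# Assumes strsplit() accepts perl=, which may not hold for older versions.
strsplit_opt <- function(x, split, ...) {
  strsplit(x, split, perl = use_pcre(), ...)
}

options(pcre = TRUE)   # set once, instead of perl = TRUE on every call
a <- gsub_opt("[0-9]+", "N", "a1b22c333")
b <- strsplit_opt("a, b,c", ", *")[[1]]
```

Passing perl= explicitly to either wrapper would clash with the argument
they already set, so a real implementation would need to handle that case.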
Mark <><