[Rd] grep and PCRE fun
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Sep 30 17:12:29 CEST 2011
On Fri, 30 Sep 2011, Simon Urbanek wrote:
> Jeff,
>
> this is really a bug in PCRE: the length (0) is a multiple of 3, as documented, so PCRE should not be writing anything. Anyway, this has now been fixed (by Brian).
Only in R-devel: R-2-13-branch is now closed (and was by the time I
read the message).
>
> Cheers,
> Simon
>
>
> On Sep 29, 2011, at 5:00 PM, Jeffrey Horner wrote:
>
>> Hello,
>>
>> I think I've found a bug in the C function do_grep located in
>> src/main/grep.c. It seems to affect both the latest revisions of
>> R-2-13-branch and trunk when compiling R without optimizations and
>> with its own version of pcre located in src/extra, at least on Ubuntu
>> 10.04.
>>
>> According to the pcre_exec API (I presume the later versions), the
>> ovecsize argument must be a multiple of 3, and the ovector argument
>> must point to a location that can hold at least ovecsize integers. All
>> the pcre_exec calls made by do_grep, save one, honor this. That one
>> call seems to overwrite areas of the stack it shouldn't. Here's the
>> smallest example I found that tickles the bug:
>>
>>> grep("[^[:blank][:cntrl]]","\\n",perl=TRUE)
>> Error in grep("[^[:blank][:cntrl]]", "\\n", perl = TRUE) :
>> negative length vectors are not allowed
>>
>> As described above, this error occurs on Ubuntu 10.04 when R is
>> compiled without optimizations (I typically use CFLAGS="-ggdb"
>> CXXFLAGS="-ggdb" FFLAGS="-ggdb" ./configure --enable-R-shlib), and the
>> pcre_exec call executed from do_grep overwrites the integer nmatches
>> and sets it to -1. This has the effect of making do_grep try to
>> allocate a results vector of length -1, which of course causes the
>> error message above.
>>
>> I'd be interested to know if this bug happens on other platforms.
>>
>> Below is my simple fix for R-2-13-branch (a similar fix works for
>> trunk as well).
>>
>> Jeff
>>
>> $ svn diff main/grep.c
>> Index: main/grep.c
>> ===================================================================
>> --- main/grep.c (revision 57110)
>> +++ main/grep.c (working copy)
>> @@ -723,7 +723,7 @@
>> {
>> SEXP pat, text, ind, ans;
>> regex_t reg;
>> - int i, j, n, nmatches = 0, ov, rc;
>> + int i, j, n, nmatches = 0, ov[3], rc;
>> int igcase_opt, value_opt, perl_opt, fixed_opt, useBytes, invert;
>> const char *spat = NULL;
>> pcre *re_pcre = NULL /* -Wall */;
>> @@ -882,7 +882,7 @@
>> if (fixed_opt)
>> LOGICAL(ind)[i] = fgrep_one(spat, s, useBytes, use_UTF8, NULL) >= 0;
>> else if (perl_opt) {
>> - if (pcre_exec(re_pcre, re_pe, s, strlen(s), 0, 0, &ov, 0) >= 0)
>> + if (pcre_exec(re_pcre, re_pe, s, strlen(s), 0, 0, ov, 3) >= 0)
>> INTEGER(ind)[i] = 1;
>> } else {
>> if (!use_WC)
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>
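For readers following the thread, here is a minimal sketch in C of the ovector sizing rule that the quoted patch relies on. Both helper names (ovector_len, ovector_args_ok) are hypothetical and are not part of R or PCRE. The sketch assumes the PCRE 8.x convention of two ints per reported substring plus a final third used as workspace. The check is deliberately stricter than the library, which, as far as I know, rounds a size that is not a multiple of 3 down rather than rejecting it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of R or PCRE): number of ints an ovector
   needs for a pattern with `capture_count` capturing groups.
   Assumption: PCRE 8.x layout, i.e. two ints per reported substring
   (the whole match plus each group) and a final third used as workspace. */
static int ovector_len(int capture_count)
{
    assert(capture_count >= 0);
    return 3 * (capture_count + 1);
}

/* Hypothetical check of the calling contract discussed in the thread:
   the size must be a non-negative multiple of 3, and a non-zero size
   must come with a real buffer. Returns 1 if acceptable, 0 otherwise. */
static int ovector_args_ok(const int *ovector, int ovecsize)
{
    if (ovecsize < 0 || ovecsize % 3 != 0)
        return 0;
    if (ovecsize > 0 && ovector == NULL)
        return 0;
    return 1;
}
```

With no capturing groups, ovector_len(0) gives 3, which matches the `int ov[3]` declaration and the `ov, 3` arguments in the quoted patch.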
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595