[Rd] segfault with readDCF on R 3.1.2 on AIX 6.1 when using install.packages
Hervé Pagès
hpages at fredhutch.org
Mon Sep 21 23:36:36 CEST 2015
On 09/21/2015 01:50 PM, Hervé Pagès wrote:
> Hi,
>
> Note that one significant change to read.dcf() that happened since R
> 3.0.2 is the addition of support for arbitrarily long lines (commit
> 63281), which never worked:
>
> dcf <- paste(c("aa: ", rep(letters, length.out=10000)), collapse="")
> writeLines(dcf, "test.dcf")
> nchar(read.dcf("test.dcf"))
> # aa
> # [1,] 8186
>
> The culprit being line 53 in src/main/dcf.c where the author of the
> Rconn_getline2() function only copies 'nbuf' chars from 'buf' to 'buf2'
> when in fact 'nbuf + 1' chars have been stored in 'buf' so far.
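
A minimal standalone sketch of that off-by-one, for anyone who wants to see it outside of R. Assumptions: the input is an in-memory string rather than an Rconnection, calloc() stands in for R_alloc() so the uncopied byte is deterministically zero (in R it is simply uninitialized), and INITIAL_BUFSIZE is a tiny stand-in for MAXELTSIZE. The names getline_sketch and copy_extra are hypothetical and not part of dcf.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INITIAL_BUFSIZE 8   /* tiny stand-in for MAXELTSIZE so growth triggers early */

/* Read one line from the string `src` using the same growth logic as the
 * quoted Rconn_getline2(). With copy_extra == 0 the grow step copies `nbuf`
 * bytes (the original code); with copy_extra == 1 it copies `nbuf + 1`
 * bytes (the quick fix). The caller frees the result. */
static char *getline_sketch(const char *src, int copy_extra)
{
    assert(src != NULL);
    int c, bufsize = INITIAL_BUFSIZE, nbuf = -1;   /* nbuf = index of last stored char */
    char *buf = calloc(bufsize, 1);
    if (!buf) return NULL;

    while ((c = (unsigned char) *src++) != '\0') {
        if (nbuf + 2 >= bufsize) {                 /* allow for terminator below */
            bufsize *= 2;
            char *buf2 = calloc(bufsize, 1);
            if (!buf2) { free(buf); return NULL; }
            /* nbuf + 1 chars are stored at this point, not nbuf */
            memcpy(buf2, buf, (size_t) (nbuf + copy_extra));
            free(buf);
            buf = buf2;
        }
        if (c != '\n') {
            buf[++nbuf] = (char) c;
        } else {
            buf[++nbuf] = '\0';
            break;
        }
    }
    if (nbuf >= 0 && buf[nbuf]) buf[++nbuf] = '\0';
    if (nbuf == -1) { free(buf); return NULL; }
    return buf;
}
```

Calling getline_sketch("abcdefghijkl", 0) yields a string that stops after "abcdef", because the seventh char is never copied into the new buffer; with copy_extra set to 1 the full 12 chars come back.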
>
> Quickest fix:
>
> Index: src/main/dcf.c
> ===================================================================
> --- src/main/dcf.c (revision 69404)
> +++ src/main/dcf.c (working copy)
> @@ -50,7 +50,7 @@
> if(nbuf+2 >= bufsize) { // allow for terminator below
> bufsize *= 2;
> char *buf2 = R_alloc(bufsize, sizeof(char));
> - memcpy(buf2, buf, nbuf);
> + memcpy(buf2, buf, nbuf + 1);
> buf = buf2;
> }
> if(c != '\n'){
>
> However a better fix would be to have 'nbuf' actually contain the number
> of chars that have been stored in 'buf' so far (as its name suggests):
>
> Index: src/main/dcf.c
> ===================================================================
> --- src/main/dcf.c (revision 69404)
> +++ src/main/dcf.c (working copy)
> @@ -42,12 +42,12 @@
> /* Use R_alloc as this might get interrupted */
> static char *Rconn_getline2(Rconnection con)
> {
> - int c, bufsize = MAXELTSIZE, nbuf = -1;
> + int c, bufsize = MAXELTSIZE, nbuf = 0;
> char *buf;
>
> buf = R_alloc(bufsize, sizeof(char));
> while((c = Rconn_fgetc(con)) != R_EOF) {
> - if(nbuf+2 >= bufsize) { // allow for terminator below
> + if(nbuf+1 >= bufsize) { // allow for terminator below
> bufsize *= 2;
> char *buf2 = R_alloc(bufsize, sizeof(char));
> memcpy(buf2, buf, nbuf);
> @@ -54,17 +54,19 @@
> buf = buf2;
> }
> if(c != '\n'){
> - buf[++nbuf] = (char) c;
> + buf[nbuf++] = (char) c;
> } else {
> - buf[++nbuf] = '\0';
> + buf[nbuf++] = '\0';
> break;
> }
> }
> + if (nbuf == 0)
> + return NULL;
> /* Make sure it is null-terminated even if file did not end with
> * newline.
> */
> - if(nbuf >= 0 && buf[nbuf]) buf[++nbuf] = '\0';
> - return (nbuf == -1) ? NULL: buf;
> + buf[nbuf-1] = '\0';
^^^^^^
Oops... this needs to be:
buf[nbuf] = '\0';
Cheers,
H.
> + return buf;
> }
>
> That improves readability and reduces the risk of bugs.
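
For illustration only, here is a standalone sketch of the same idea, with 'nbuf' holding the count of chars stored and the terminator written at buf[nbuf]. It is not the patch itself. Assumptions: it reads from an in-memory string through a cursor instead of an Rconnection, uses malloc()/realloc() instead of R_alloc(), does not store the '\0' inside the loop, and uses a hypothetical saw_newline flag to tell an empty line from end of input. The name getline_counted and the INITIAL_BUFSIZE constant are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INITIAL_BUFSIZE 8   /* tiny stand-in for MAXELTSIZE */

/* Read the next line from *src and advance *src past it.
 * Returns a malloc'ed, null-terminated string without the newline,
 * or NULL at end of input (or on allocation failure). */
static char *getline_counted(const char **src)
{
    assert(src != NULL && *src != NULL);
    int bufsize = INITIAL_BUFSIZE;
    int nbuf = 0;                   /* number of chars stored in buf so far */
    int saw_newline = 0;
    const char *p = *src;
    char *buf = malloc((size_t) bufsize);
    if (!buf) return NULL;

    while (*p != '\0') {
        char c = *p++;
        if (c == '\n') { saw_newline = 1; break; }
        if (nbuf + 1 >= bufsize) {  /* keep one slot free for the terminator */
            bufsize *= 2;
            char *buf2 = realloc(buf, (size_t) bufsize);
            if (!buf2) { free(buf); return NULL; }
            buf = buf2;
        }
        buf[nbuf++] = c;
    }
    *src = p;
    if (nbuf == 0 && !saw_newline) { free(buf); return NULL; }  /* end of input */
    buf[nbuf] = '\0';               /* nbuf <= bufsize - 1, so this is in bounds */
    return buf;
}
```

Because nbuf is a count, the grow step and the final terminator use the same number, so there is no "+1" to forget.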
>
> Also note that Rconn_getline2() allocates a new buffer for each line in
> the DCF file. So we got support for arbitrarily long lines (a rare
> situation) at the price of a slowdown and increased memory usage for
> all DCF files. Sounds less than optimal :-/
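
One possible way around the per-line allocation, sketched under assumptions: keep a single growable buffer alive across calls and only enlarge it when a line does not fit. The line_buffer struct and read_line_reuse function are hypothetical names, the input is an in-memory string rather than an Rconnection, and realloc() replaces R_alloc(), so the interrupt-safety that motivated R_alloc() in dcf.c is not addressed here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One buffer shared by all lines; starts empty and only ever grows. */
typedef struct {
    char  *buf;
    size_t cap;
} line_buffer;

/* Read the next line from *src into lb->buf and advance *src past it.
 * Returns lb->buf (valid until the next call), or NULL at end of input.
 * NULL is also returned on allocation failure, which a real
 * implementation would want to report separately. */
static const char *read_line_reuse(line_buffer *lb, const char **src)
{
    assert(lb != NULL && src != NULL && *src != NULL);
    const char *p = *src;
    size_t n = 0;                       /* chars stored for the current line */

    if (*p == '\0') return NULL;        /* end of input */
    if (lb->cap == 0) {                 /* first use: allocate once */
        lb->buf = malloc(8);
        if (!lb->buf) return NULL;
        lb->cap = 8;
    }
    while (*p != '\0') {
        char c = *p++;
        if (c == '\n') break;
        if (n + 1 >= lb->cap) {         /* keep one slot for the terminator */
            char *nb = realloc(lb->buf, lb->cap * 2);
            if (!nb) return NULL;
            lb->buf = nb;
            lb->cap *= 2;
        }
        lb->buf[n++] = c;
    }
    lb->buf[n] = '\0';
    *src = p;
    return lb->buf;
}
```

With this shape, short lines never trigger an allocation after the first call, and a long line pays for growth once. The caller frees lb->buf when done.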
>
> Cheers,
> H.
>
>
> On 09/21/2015 11:01 AM, Duncan Murdoch wrote:
>> On 21/09/2015 1:49 PM, Vinh Nguyen wrote:
>>> Here's an update:
>>>
>>> I checked the ChangeLog for R, and it seems like readDCF was changed
>>> in 3.0.2. I went on a whim and copied src/main/dcf.c from R 2.15.3
>>> over to 3.2.2, and R compiled fine and install.packages now works for
>>> me.
>>>
>>> This is probably not ideal, but it at least makes R usable on AIX for
>>> me. Would definitely like to help figure out what's wrong with the
>>> new dcf.c on AIX.
>>
>> I don't know if anyone on the core team has access to AIX, so you're
>> likely on your own for this.
>>
>> I'd suggest running R in a debugger (gdb or whatever you have), and
>> identifying exactly which line in dcf.c fails, and why. If you tell us
>> that, we might be able to spot what is going wrong.
>>
>> Duncan Murdoch
>>
>>>
>>> Thanks.
>>>
>>> -- Vinh
>>>
>>>
>>> On Mon, Sep 21, 2015 at 10:01 AM, Vinh Nguyen <vinhdizzo at gmail.com>
>>> wrote:
>>>> Hi there,
>>>>
>>>> I just wanted to follow up on this readDCF issue with install.packages
>>>> on AIX on R 3.*. I'm happy to help try potential solutions or debug
>>>> if anyone could point me in the right direction.
>>>>
>>>> To re-cap, it appears readDCF is segfault'ing since R 3.* on AIX.
>>>> This was not the case up until R 2.15.3. This makes install.packages
>>>> not usable. Thanks.
>>>>
>>>> -- Vinh
>>>>
>>>>
>>>> On Tue, Nov 11, 2014 at 10:23 AM, Vinh Nguyen <vinhdizzo at gmail.com>
>>>> wrote:
>>>>> Dear list (re-posting from r-help as r-devel is probably more
>>>>> appropriate),
>>>>>
>>>>> I was able to successfully compile R on our AIX box at work using the
>>>>> GNU compilers following the instructions on the R Administration
>>>>> guide. The output can be seen here
>>>>> (https://gist.github.com/nguyenvinh/504321ea9c89d8919bef) and yields
>>>>> no errors.
>>>>>
>>>>> However, I get a segfault whenever I try to use the install.packages
>>>>> function to install packages. Using debug, I was able to trace it to
>>>>> the readDCF function:
>>>>>
>>>>> Browse[2]>
>>>>> debug: if (!all) return(.Internal(readDCF(file, fields, keep.white)))
>>>>> Browse[2]>
>>>>> debug: return(.Internal(readDCF(file, fields, keep.white)))
>>>>> Browse[2]>
>>>>>
>>>>> *** caught segfault ***
>>>>> address 4, cause 'invalid permissions'
>>>>>
>>>>> Possible actions:
>>>>> 1: abort (with core dump, if enabled)
>>>>> 2: normal R exit
>>>>> 3: exit R without saving workspace
>>>>> 4: exit R saving workspace
>>>>> Selection:
>>>>>
>>>>> I was curious whether anyone has a clue why this error occurs or what I
>>>>> could do to fix it. I'm able to install packages via R CMD INSTALL,
>>>>> but I would hate to have to manually determine dependencies, download
>>>>> the source for each package, and install them "by hand" via R CMD
>>>>> INSTALL.
>>>>>
>>>>> I went back and compiled older versions of R to see if this error
>>>>> exists. On R 3.0.3, I get:
>>>>>
>>>>> debug(available.packages)
>>>>> install.packages('ggplot2', dep=TRUE,
>>>>> repo='http://cran.stat.ucla.edu')
>>>>> ...
>>>>> Browse[2]>
>>>>> debug: z <- res0 <- tryCatch(read.dcf(file = tmpf), error = identity)
>>>>> Browse[2]>
>>>>> Error: segfault from C stack overflow
>>>>>
>>>>> On R 2.15.3, I do not see the error.
>>>>>
>>>>> Would be great to get this resolved. Thank you for your help.
>>>>>
>>>>> -- Vinh
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319