[Rd] segfault with readDCF on R 3.1.2 on AIX 6.1 when using install.packages

Duncan Murdoch murdoch.duncan at gmail.com
Mon Sep 21 23:48:02 CEST 2015


On 21/09/2015 4:50 PM, Hervé Pagès wrote:
> Hi,
> 
> Note that one significant change to read.dcf() that happened since R
> 3.0.2 is the addition of support for arbitrarily long lines (commit
> 63281), which never worked:
> 
>    dcf <- paste(c("aa: ", rep(letters, length.out=10000)), collapse="")
>    writeLines(dcf, "test.dcf")
>    nchar(read.dcf("test.dcf"))
>    #        aa
>    # [1,] 8186
> 

I don't see that in R 3.2.2 on OSX or 3.2.2 patched on Windows:

>    nchar(read.dcf("test.dcf"))
        aa
[1,] 10000

Duncan Murdoch

> The culprit being line 53 in src/main/dcf.c where the author of the
> Rconn_getline2() function only copies 'nbuf' chars from 'buf' to 'buf2'
> when in fact 'nbuf + 1' chars have been stored in 'buf' so far.
> 
> Quickest fix:
> 
> Index: src/main/dcf.c
> ===================================================================
> --- src/main/dcf.c	(revision 69404)
> +++ src/main/dcf.c	(working copy)
> @@ -50,7 +50,7 @@
>   	if(nbuf+2 >= bufsize) { // allow for terminator below
>   	    bufsize *= 2;
>   	    char *buf2 = R_alloc(bufsize, sizeof(char));
> -	    memcpy(buf2, buf, nbuf);
> +	    memcpy(buf2, buf, nbuf + 1);
>   	    buf = buf2;
>   	}
>   	if(c != '\n'){
> 
> However a better fix would be to have 'nbuf' actually contain the number
> of chars stored in 'buf' so far (as its name suggests):
> 
> Index: src/main/dcf.c
> ===================================================================
> --- src/main/dcf.c	(revision 69404)
> +++ src/main/dcf.c	(working copy)
> @@ -42,12 +42,12 @@
>   /* Use R_alloc as this might get interrupted */
>   static char *Rconn_getline2(Rconnection con)
>   {
> -    int c, bufsize = MAXELTSIZE, nbuf = -1;
> +    int c, bufsize = MAXELTSIZE, nbuf = 0;
>       char *buf;
> 
>       buf = R_alloc(bufsize, sizeof(char));
>       while((c = Rconn_fgetc(con)) != R_EOF) {
> -	if(nbuf+2 >= bufsize) { // allow for terminator below
> +	if(nbuf+1 >= bufsize) { // allow for terminator below
>   	    bufsize *= 2;
>   	    char *buf2 = R_alloc(bufsize, sizeof(char));
>   	    memcpy(buf2, buf, nbuf);
> @@ -54,17 +54,19 @@
>   	    buf = buf2;
>   	}
>   	if(c != '\n'){
> -	    buf[++nbuf] = (char) c;
> +	    buf[nbuf++] = (char) c;
>   	} else {
> -	    buf[++nbuf] = '\0';
> +	    buf[nbuf++] = '\0';
>   	    break;
>   	}
>       }
> +    if (nbuf == 0)
> +        return NULL;
>       /* Make sure it is null-terminated even if file did not end with
>        *  newline.
>        */
> -    if(nbuf >= 0 && buf[nbuf]) buf[++nbuf] = '\0';
> -    return (nbuf == -1) ? NULL: buf;
> +    if(buf[nbuf-1]) buf[nbuf] = '\0';
> +    return buf;
>   }
> 
> That improves readability and reduces the risk of bugs.
> 
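For anyone who wants to try the counting convention outside of R, here is a minimal stand-alone sketch. It is not the code in src/main/dcf.c: read_line_sketch() is a hypothetical name, it reads from a stdio FILE rather than an Rconnection, and it uses malloc/free where dcf.c uses R_alloc. 'nbuf' is the number of chars stored so far, so the copy on growth needs no "+ 1", and the terminator is appended rather than written over the last stored char, so a final line without a trailing newline keeps all of its characters. As a side note, assuming MAXELTSIZE is 8192, the 8186 reported above fits the off-by-one: the buffer grows when 8191 chars are stored, only 8190 are copied, and if the uncopied byte happens to be 0 the string ends after 8190 chars, i.e. 8186 once "aa: " is removed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-alone analogue of Rconn_getline2(): reads one line
 * from a stdio stream.  Returns NULL at end of input (or on allocation
 * failure), otherwise a NUL-terminated line without the trailing '\n'.
 * The caller frees the result. */
char *read_line_sketch(FILE *fp, size_t initial_size)
{
    assert(fp != NULL && initial_size >= 2);

    size_t bufsize = initial_size;
    size_t nbuf = 0;              /* number of chars stored in buf so far */
    int c, seen_any = 0;
    char *buf = malloc(bufsize);
    if (buf == NULL) return NULL;

    while ((c = fgetc(fp)) != EOF) {
        seen_any = 1;
        if (c == '\n') break;
        if (nbuf + 1 >= bufsize) {     /* keep one slot for the terminator */
            bufsize *= 2;
            char *buf2 = malloc(bufsize);
            if (buf2 == NULL) { free(buf); return NULL; }
            memcpy(buf2, buf, nbuf);   /* nbuf is a count: no "+ 1" needed */
            free(buf);
            buf = buf2;
        }
        buf[nbuf++] = (char) c;
    }
    if (!seen_any) { free(buf); return NULL; }   /* nothing left to read */

    buf[nbuf] = '\0';   /* append the terminator, never overwrite data */
    return buf;
}
```

Calling it with a small initial_size (say 4) forces several growth steps on even a short line, which makes an off-by-one in the copy easy to provoke.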
> Also note that Rconn_getline2() allocates a new buffer for each line in
> the DCF file. So we got support for arbitrarily long lines (a rare
> situation) at the price of a slowdown and increased memory usage for
> all DCF files. Sounds less than optimal :-/
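
One way to avoid a fresh allocation per line is to let the caller own a single buffer that is reused across lines and only grows when a longer line turns up. The following is a rough sketch of that idea under stated assumptions, not a proposed patch: line_buffer and read_line_reuse() are hypothetical names, the stream is a stdio FILE rather than an Rconnection, realloc stands in for R_alloc, and the caller is assumed to consume (or copy) each line before asking for the next one.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reusable line buffer: allocated lazily, grown on demand. */
typedef struct {
    char  *data;
    size_t size;
} line_buffer;

/* Reads one line into lb, reusing its storage.  Returns lb->data (valid
 * until the next call) or NULL at end of input or on allocation failure. */
const char *read_line_reuse(FILE *fp, line_buffer *lb)
{
    assert(fp != NULL && lb != NULL);

    if (lb->size == 0) {               /* first use: allocate once */
        lb->data = malloc(64);
        if (lb->data == NULL) return NULL;
        lb->size = 64;
    }

    size_t n = 0;                      /* chars stored for this line */
    int c, seen_any = 0;
    while ((c = fgetc(fp)) != EOF) {
        seen_any = 1;
        if (c == '\n') break;
        if (n + 1 >= lb->size) {       /* keep one slot for the terminator */
            char *p = realloc(lb->data, lb->size * 2);
            if (p == NULL) return NULL;
            lb->data = p;
            lb->size *= 2;
        }
        lb->data[n++] = (char) c;
    }
    if (!seen_any) return NULL;

    lb->data[n] = '\0';
    return lb->data;
}
```

With this shape, a file made of short lines costs one allocation in total, and the buffer only doubles for the occasional long line. The caller frees lb->data once after the last line.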
> 
> Cheers,
> H.
> 
> 
> On 09/21/2015 11:01 AM, Duncan Murdoch wrote:
>> On 21/09/2015 1:49 PM, Vinh Nguyen wrote:
>>> Here's an update:
>>>
>>> I checked the ChangeLog for R, and it seems like readDCF was changed
>>> in 3.0.2.  I went on a whim and copied src/main/dcf.c from R 2.15.3
>>> over to 3.2.2, and R compiled fine and install.packages now works for
>>> me.
>>>
>>> This is probably not ideal, but it at least makes R usable on AIX for
>>> me.  Would definitely like to help figure out what's wrong with the
>>> new dcf.c on AIX.
>>
>> I don't know if anyone on the core team has access to AIX, so you're
>> likely on your own for this.
>>
>> I'd suggest running R in a debugger (gdb or whatever you have), and
>> identifying exactly which line in dcf.c fails, and why.  If you tell us
>> that, we might be able to spot what is going wrong.
>>
>> Duncan Murdoch
>>
>>>
>>> Thanks.
>>>
>>> -- Vinh
>>>
>>>
>>> On Mon, Sep 21, 2015 at 10:01 AM, Vinh Nguyen <vinhdizzo at gmail.com> wrote:
>>>> Hi there,
>>>>
>>>> I just wanted to follow up on this readDCF issue with install.packages
>>>> on AIX on R 3.*.  I'm happy to help try potential solutions or debug
>>>> if anyone could point me in the right direction.
>>>>
>>>> To re-cap, it appears readDCF is segfault'ing since R 3.* on AIX.
>>>> This was not the case up until R 2.15.3.  This makes install.packages
>>>> not usable.  Thanks.
>>>>
>>>> -- Vinh
>>>>
>>>>
>>>> On Tue, Nov 11, 2014 at 10:23 AM, Vinh Nguyen <vinhdizzo at gmail.com> wrote:
>>>>> Dear list (re-posting from r-help as r-devel is probably more appropriate),
>>>>>
>>>>> I was able to successfully compile R on our AIX box at work using the
>>>>> GNU compilers following the instructions on the R Administration
>>>>> guide.  The output can be seen here
>>>>> (https://gist.github.com/nguyenvinh/504321ea9c89d8919bef) and yields
>>>>> no errors.
>>>>>
>>>>> However, I get a segfault whenever I try to use the install.packages
>>>>> function to install packages.  Using debug, I was able to trace it to
>>>>> the readDCF function:
>>>>>
>>>>> Browse[2]>
>>>>> debug: if (!all) return(.Internal(readDCF(file, fields, keep.white)))
>>>>> Browse[2]>
>>>>> debug: return(.Internal(readDCF(file, fields, keep.white)))
>>>>> Browse[2]>
>>>>>
>>>>>   *** caught segfault ***
>>>>> address 4, cause 'invalid permissions'
>>>>>
>>>>> Possible actions:
>>>>> 1: abort (with core dump, if enabled)
>>>>> 2: normal R exit
>>>>> 3: exit R without saving workspace
>>>>> 4: exit R saving workspace
>>>>> Selection:
>>>>>
>>>>> Does anyone have a clue as to why this error occurs or what I
>>>>> could do to fix it?  I'm able to install packages via R CMD INSTALL,
>>>>> but I would hate to have to manually determine dependencies, download
>>>>> the source for each package, and install them "by hand" via R CMD
>>>>> INSTALL.
>>>>>
>>>>> I went back and compiled older versions of R to see if this error
>>>>> exists.  On R 3.0.3, I get:
>>>>>
>>>>> debug(available.packages)
>>>>> install.packages('ggplot2', dep=TRUE, repo='http://cran.stat.ucla.edu')
>>>>> ...
>>>>> Browse[2]>
>>>>> debug: z <- res0 <- tryCatch(read.dcf(file = tmpf), error = identity)
>>>>> Browse[2]>
>>>>> Error: segfault from C stack overflow
>>>>>
>>>>> On R 2.15.3, I do not see the error.
>>>>>
>>>>> Would be great to get this resolved.  Thank you for your help.
>>>>>
>>>>> -- Vinh
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>


