[Rd] segfault with readDCF on R 3.1.2 on AIX 6.1 when using install.packages

Hervé Pagès hpages at fredhutch.org
Mon Sep 21 22:50:46 CEST 2015


Hi,

Note that one significant change to read.dcf() since R 3.0.2 is the
addition of support for arbitrarily long lines (commit 63281), which
never worked. Here a 10000-char value comes back as 8186 chars:

   dcf <- paste(c("aa: ", rep(letters, length.out=10000)), collapse="")
   writeLines(dcf, "test.dcf")
   nchar(read.dcf("test.dcf"))
   #        aa
   # [1,] 8186

The culprit is line 53 in src/main/dcf.c: when growing the buffer,
Rconn_getline2() copies only 'nbuf' chars from 'buf' to 'buf2' when in
fact 'nbuf + 1' chars have been stored in 'buf' so far.
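
To see the off-by-one in isolation, here is a minimal standalone sketch.
It is not the dcf.c code: getline_sketch() and its 'copy_extra' switch
are made up for illustration, the input is a plain C string instead of
an Rconnection, and calloc/free stand in for R_alloc. Because calloc
zero-fills, the char that is not copied reads back as a terminator;
with R_alloc that byte is simply uninitialized.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the loop in Rconn_getline2(): reads one line
 * from the string 'src', with 'nbuf' being the index of the last char
 * stored (so nbuf + 1 chars are in 'buf').
 * copy_extra = 0 mimics memcpy(buf2, buf, nbuf);
 * copy_extra = 1 mimics memcpy(buf2, buf, nbuf + 1).
 * calloc zero-fills, so a char lost during growth reads back as '\0'.
 * The caller frees the result. */
static char *getline_sketch(const char *src, int bufsize, int copy_extra)
{
    int nbuf = -1;
    char *buf;

    assert(bufsize >= 2);
    buf = calloc((size_t) bufsize, 1);
    if (buf == NULL) return NULL;
    for (; *src != '\0'; src++) {
        if (nbuf + 2 >= bufsize) {      /* allow for terminator below */
            char *buf2;
            bufsize *= 2;
            buf2 = calloc((size_t) bufsize, 1);
            if (buf2 == NULL) { free(buf); return NULL; }
            memcpy(buf2, buf, (size_t) (nbuf + copy_extra));
            free(buf);
            buf = buf2;
        }
        if (*src != '\n') {
            buf[++nbuf] = *src;
        } else {
            buf[++nbuf] = '\0';
            break;
        }
    }
    if (nbuf == -1) { free(buf); return NULL; }   /* no input at all */
    if (buf[nbuf]) buf[++nbuf] = '\0';            /* no trailing newline */
    return buf;
}
```

With an initial buffer of 8 bytes and the 12-char input "abcdefghijkl",
the copy_extra = 0 variant returns a string of length 6 (the 'g' at
index 6 is lost when the buffer grows), while copy_extra = 1 returns all
12 chars. That is the same "bufsize - 2" pattern as in the R example
above, assuming MAXELTSIZE is 8192: 8190 chars survive, i.e. "aa: " plus
8186.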

Quickest fix:

Index: src/main/dcf.c
===================================================================
--- src/main/dcf.c	(revision 69404)
+++ src/main/dcf.c	(working copy)
@@ -50,7 +50,7 @@
  	if(nbuf+2 >= bufsize) { // allow for terminator below
  	    bufsize *= 2;
  	    char *buf2 = R_alloc(bufsize, sizeof(char));
-	    memcpy(buf2, buf, nbuf);
+	    memcpy(buf2, buf, nbuf + 1);
  	    buf = buf2;
  	}
  	if(c != '\n'){

However, a better fix would be to have 'nbuf' actually contain the
number of chars stored in 'buf' so far (as its name suggests):

Index: src/main/dcf.c
===================================================================
--- src/main/dcf.c	(revision 69404)
+++ src/main/dcf.c	(working copy)
@@ -42,12 +42,12 @@
  /* Use R_alloc as this might get interrupted */
  static char *Rconn_getline2(Rconnection con)
  {
-    int c, bufsize = MAXELTSIZE, nbuf = -1;
+    int c, bufsize = MAXELTSIZE, nbuf = 0;
      char *buf;

      buf = R_alloc(bufsize, sizeof(char));
      while((c = Rconn_fgetc(con)) != R_EOF) {
-	if(nbuf+2 >= bufsize) { // allow for terminator below
+	if(nbuf+1 >= bufsize) { // allow for terminator below
  	    bufsize *= 2;
  	    char *buf2 = R_alloc(bufsize, sizeof(char));
  	    memcpy(buf2, buf, nbuf);
@@ -54,17 +54,19 @@
  	    buf = buf2;
  	}
  	if(c != '\n'){
-	    buf[++nbuf] = (char) c;
+	    buf[nbuf++] = (char) c;
  	} else {
-	    buf[++nbuf] = '\0';
+	    buf[nbuf++] = '\0';
  	    break;
  	}
      }
+    if (nbuf == 0)
+        return NULL;
      /* Make sure it is null-terminated even if file did not end with
       *  newline.
       */
-    if(nbuf >= 0 && buf[nbuf]) buf[++nbuf] = '\0';
-    return (nbuf == -1) ? NULL: buf;
+    if (buf[nbuf-1]) buf[nbuf] = '\0';
+    return buf;
  }

That improves readability and reduces the risk of bugs.
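
For reference, the "nbuf is a count" convention can be sketched outside
of R as below. This is only an illustration under stated assumptions:
getline_count_sketch() is a made-up name, the input is a C string that
the function advances through, and malloc/free stand in for R_alloc. It
also writes the terminator once, after the loop, which is a different
layout from the patch. The detail worth checking in any such rewrite is
a last line with no trailing newline: the terminator has to go after the
last stored char, not on top of it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration, not the dcf.c code: read one line from the
 * string at *psrc, with 'nbuf' holding the number of chars stored so far.
 * Advances *psrc past the line (and its newline, if any). Returns NULL
 * at end of input or if allocation fails. The caller frees the result. */
static char *getline_count_sketch(const char **psrc, size_t bufsize)
{
    const char *src = *psrc;
    size_t nbuf = 0;
    char *buf;

    assert(bufsize >= 1);
    if (*src == '\0') return NULL;          /* nothing left to read */
    buf = malloc(bufsize);
    if (buf == NULL) return NULL;
    while (*src != '\0' && *src != '\n') {
        if (nbuf + 1 >= bufsize) {          /* keep room for the terminator */
            char *buf2 = malloc(bufsize * 2);
            if (buf2 == NULL) { free(buf); return NULL; }
            memcpy(buf2, buf, nbuf);        /* exactly the chars stored */
            free(buf);
            buf = buf2;
            bufsize *= 2;
        }
        buf[nbuf++] = *src++;
    }
    buf[nbuf] = '\0';                       /* nbuf < bufsize always holds */
    if (*src == '\n') src++;                /* consume the line ending */
    *psrc = src;
    return buf;
}
```

Called repeatedly on "ab\n\ncde", this gives "ab", then an empty string
for the blank line, then "cde" (no char lost even though the input does
not end with a newline), then NULL.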

Also note that Rconn_getline2() allocates a new buffer for each line in
the DCF file. So we got support for arbitrarily long lines (a rare
situation) at the price of a slowdown and increased memory usage for
all DCF files. Sounds less than optimal :-/
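
One way around that, as a rough sketch only, would be to keep a single
buffer alive across lines and grow it only when a line does not fit.
The code below is not a patch against dcf.c: the line_buffer struct and
both function names are hypothetical, the input is a C string, and it
uses realloc, whereas the real code uses R_alloc (so that memory is
reclaimed on interrupt) and would need a different mechanism.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical reusable line buffer; start with {NULL, 0}. */
typedef struct {
    char  *buf;
    size_t size;
} line_buffer;

/* Make sure lb can hold 'needed' bytes. Returns 1 on success, 0 on
 * allocation failure (in which case lb is left unchanged). */
static int ensure_room(line_buffer *lb, size_t needed)
{
    size_t newsize;
    char *tmp;

    if (needed <= lb->size) return 1;       /* common case: no allocation */
    newsize = lb->size ? lb->size : 16;
    while (newsize < needed) newsize *= 2;
    tmp = realloc(lb->buf, newsize);
    if (tmp == NULL) return 0;
    lb->buf = tmp;
    lb->size = newsize;
    return 1;
}

/* Read the next line from *psrc into lb->buf, reusing its storage.
 * Returns lb->buf, or NULL at end of input or on allocation failure.
 * The returned pointer is only valid until the next call. */
static const char *read_line_reuse(line_buffer *lb, const char **psrc)
{
    const char *src = *psrc;
    size_t n = 0;

    if (*src == '\0') return NULL;
    if (!ensure_room(lb, 1)) return NULL;   /* room for an empty line */
    while (*src != '\0' && *src != '\n') {
        if (!ensure_room(lb, n + 2)) return NULL;  /* char + terminator */
        lb->buf[n++] = *src++;
    }
    lb->buf[n] = '\0';
    if (*src == '\n') src++;
    *psrc = src;
    return lb->buf;
}
```

The trade-off is that each returned line is overwritten by the next
call, so the caller has to copy or fully process a line before reading
the next one. In exchange, a file made of short lines triggers a single
allocation instead of one per line.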

Cheers,
H.


On 09/21/2015 11:01 AM, Duncan Murdoch wrote:
> On 21/09/2015 1:49 PM, Vinh Nguyen wrote:
>> Here's an update:
>>
>> I checked the ChangeLog for R, and it seems like readDCF was changed
>> in 3.0.2.  I went on a whim and copied src/main/dcf.c from R 2.15.3
>> over to 3.2.2, and R compiled fine and install.packages now works for
>> me.
>>
>> This is probably not ideal, but it at least makes R usable on AIX for
>> me.  Would definitely like to help figure out what's wrong with the
>> new dcf.c on AIX.
>
> I don't know if anyone on the core team has access to AIX, so you're
> likely on your own for this.
>
> I'd suggest running R in a debugger (gdb or whatever you have), and
> identifying exactly which line in dcf.c fails, and why.  If you tell us
> that, we might be able to spot what is going wrong.
>
> Duncan Murdoch
>
>>
>> Thanks.
>>
>> -- Vinh
>>
>>
>> On Mon, Sep 21, 2015 at 10:01 AM, Vinh Nguyen <vinhdizzo at gmail.com> wrote:
>>> Hi there,
>>>
>>> I just wanted to follow up on this readDCF issue with install.packages
>>> on AIX on R 3.*.  I'm happy to help try potential solutions or debug
>>> if anyone could point me in the right direction.
>>>
>>> To re-cap, it appears readDCF is segfault'ing since R 3.* on AIX.
>>> This was not the case up until R 2.15.3.  This makes install.packages
>>> not usable.  Thanks.
>>>
>>> -- Vinh
>>>
>>>
>>> On Tue, Nov 11, 2014 at 10:23 AM, Vinh Nguyen <vinhdizzo at gmail.com> wrote:
>>>> Dear list (re-posting from r-help as r-devel is probably more appropriate),
>>>>
>>>> I was able to successfully compile R on our AIX box at work using the
>>>> GNU compilers following the instructions on the R Administration
>>>> guide.  The output can be seen here
>>>> (https://gist.github.com/nguyenvinh/504321ea9c89d8919bef) and yields
>>>> no errors.
>>>>
>>>> However, I get a segfault whenever I try to use the install.packages
>>>> function to install packages.  Using debug, I was able to trace it to
>>>> the readDCF function:
>>>>
>>>> Browse[2]>
>>>> debug: if (!all) return(.Internal(readDCF(file, fields, keep.white)))
>>>> Browse[2]>
>>>> debug: return(.Internal(readDCF(file, fields, keep.white)))
>>>> Browse[2]>
>>>>
>>>>   *** caught segfault ***
>>>> address 4, cause 'invalid permissions'
>>>>
>>>> Possible actions:
>>>> 1: abort (with core dump, if enabled)
>>>> 2: normal R exit
>>>> 3: exit R without saving workspace
>>>> 4: exit R saving workspace
>>>> Selection:
>>>>
>>>> Was curious if anyone has a clue why this error occurs or what I
>>>> could do to fix it?  I'm able to install packages via R CMD INSTALL,
>>>> but I would hate to have to manually determine dependencies, download
>>>> the source for each package, and install them "by hand" via R CMD
>>>> INSTALL.
>>>>
>>>> I went back and compiled older versions of R to see if this error
>>>> exists.  On R 3.0.3, I get:
>>>>
>>>> debug(available.packages)
>>>> install.packages('ggplot2', dep=TRUE, repo='http://cran.stat.ucla.edu')
>>>> ...
>>>> Browse[2]>
>>>> debug: z <- res0 <- tryCatch(read.dcf(file = tmpf), error = identity)
>>>> Browse[2]>
>>>> Error: segfault from C stack overflow
>>>>
>>>> On R 2.15.3, I do not see the error.
>>>>
>>>> Would be great to get this resolved.  Thank you for your help.
>>>>
>>>> -- Vinh
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


