[Rd] Line splitting in system() (PR#6624)

mjw at celos.net mjw at celos.net
Sat Feb 28 16:33:33 MET 2004


According to the manual, system() splits output lines into
8096-char chunks; under UNIX it actually seems to return 8094
chars per chunk and drop the 8095th.  Spot the missing digits in:

  x2 <-
    system("perl -e 'print \"0123456789\"x10000'",
    intern=TRUE)

Looks like a bug in the code that removes newlines at
src/unix/sys-unix.c:218 -- fgets() reads at most size-1 characters
and appends a null, so strlen(buf)<size is always true and the
last character of every chunk gets chopped, newline or not.
Testing for '\n' explicitly is probably better (it also deals with
the 8094 chars + \n case) -- it turns out the win32 code already
does this anyway.  (IIRC the read>0 condition in the win32 code
would be redundant, but I copied it anyway to be safe.)
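
To make the fgets() behaviour concrete, here is a minimal standalone
sketch, not the R source itself.  DEMO_BUFSIZE, chop_old(), chop_new()
and roundtrip() are hypothetical names made up for the illustration,
and the buffer is shrunk to 8 bytes so the effect shows up on a short
line; only the two chop tests mirror the old and patched lines in the
diff below.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_BUFSIZE 8  /* hypothetical stand-in for INTERN_BUFSIZE */

/* Old test: fgets() stores at most size-1 characters, so
   strlen(buf) < size always holds and the last character is
   chopped even when it is not a newline.  (Like the original, this
   assumes fgets() returned at least one character.) */
void chop_old(char *buf, size_t size)
{
    size_t read = strlen(buf);
    if (read < size) buf[read - 1] = '\0';
}

/* Patched test: chop only a real trailing newline. */
void chop_new(char *buf)
{
    size_t read = strlen(buf);
    if (read > 0 && buf[read - 1] == '\n') buf[read - 1] = '\0';
}

/* Write `line` to a temporary file, read it back in DEMO_BUFSIZE
   chunks, chop each chunk with the old or new test, and append the
   chunks to `out`.  Returns the number of chunks, or -1 on error. */
int roundtrip(const char *line, int use_new, char *out, size_t outsize)
{
    char buf[DEMO_BUFSIZE];
    int n = 0;
    FILE *fp = tmpfile();

    if (!fp) return -1;
    fputs(line, fp);
    rewind(fp);
    out[0] = '\0';
    while (fgets(buf, DEMO_BUFSIZE, fp)) {
        if (use_new) chop_new(buf);
        else chop_old(buf, DEMO_BUFSIZE);
        if (strlen(out) + strlen(buf) + 1 > outsize) {
            fclose(fp);
            return -1;
        }
        strcat(out, buf);
        n++;
    }
    fclose(fp);
    return n;
}
```

Calling roundtrip("0123456789ABCDEF\n", 0, out, sizeof out) with a
64-byte `out` rejoins the chunks as "012345789ABCEF" (the 6 and the D
are lost, one per full chunk), while passing 1 for use_new gives back
"0123456789ABCDEF".  Both variants split the line into three chunks.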

Anyway, rather trivial diff below.  Both manpages should
probably say 8095 rather than 8096, I think.

Mark <><

Index: library/base/man/unix/system.Rd
===================================================================
RCS file: /cvs/R/src/library/base/man/unix/system.Rd,v
retrieving revision 1.2
diff -u -r1.2 system.Rd
--- library/base/man/unix/system.Rd	2002/12/08 09:50:47	1.2
+++ library/base/man/unix/system.Rd	2004/02/28 15:20:09
@@ -26,7 +26,7 @@
   If \code{intern} is \code{TRUE} then \code{popen} is used to invoke the
   command and the output collected, line by line, into an \R
   \code{\link{character}} vector which is returned as the value of
-  \code{system}.  Output lines of more that 8096 characters will be split.
+  \code{system}.  Output lines of more that 8095 characters will be split.
 
   If \code{intern} is \code{FALSE} then the C function \code{system}
   is used to invoke the command and the value returned by \code{system}
Index: library/base/man/windows/system.Rd
===================================================================
RCS file: /cvs/R/src/library/base/man/windows/system.Rd,v
retrieving revision 1.15
diff -u -r1.15 system.Rd
--- library/base/man/windows/system.Rd	2003/05/08 21:45:54	1.15
+++ library/base/man/windows/system.Rd	2004/02/28 15:20:09
@@ -33,7 +33,7 @@
   If \code{intern = TRUE}, a character vector giving the output of the
   command, one line per character string. If the command could not be
   run or gives an error a \R error is generated.
-  (Output ines of more that 8096 characters will be split.)
+  (Output lines of more that 8095 characters will be split.)
 
   If \code{intern = FALSE}, the return value is a error code, given the
   invisible attribute (so needs to be printed explicitly). If the
Index: unix/sys-unix.c
===================================================================
RCS file: /cvs/R/src/unix/sys-unix.c,v
retrieving revision 1.39
diff -u -r1.39 sys-unix.c
--- unix/sys-unix.c	2003/09/10 11:45:29	1.39
+++ unix/sys-unix.c	2004/02/28 15:20:15
@@ -215,7 +215,8 @@
 	fp = R_popen(CHAR(STRING_ELT(CAR(args), 0)), x);
 	for (i = 0; fgets(buf, INTERN_BUFSIZE, fp); i++) {
 	    read = strlen(buf);
-	    if (read < INTERN_BUFSIZE) buf[read - 1] = '\0'; /* chop final CR */
+	    if (read>0 && buf[read-1] == '\n') 
+		buf[read - 1] = '\0'; /* chop final CR */
 	    tchar = mkChar(buf);
 	    UNPROTECT(1);
 	    PROTECT(tlist = CONS(tchar, tlist));


