[Rd] writeChar potential buffer overrun (PR#5090)

Prof Brian Ripley ripley at stats.ox.ac.uk
Sat Nov 15 05:54:20 MET 2003


Thanks, yes that all makes sense and checks out.
I decided to zero-pad the output, with a warning.

On Fri, 14 Nov 2003, Mark White wrote:

> Prof Brian Ripley writes:
> > Could you please give a reproducible example?
> 
> This breaks a new R session for me:
> 
>   zz <- file("/tmp/test.raw", "wb")
>   hdr <- ""
>   writeChar(hdr,zz,nchars=10000000)
> 
> Causes sig 11 on two machines, one with R 1.7.1 and the
> other with R 1.8.0 (both on NetBSD, so YMMV on other OSs, I
> suppose.)
> 
> > On Fri, 14 Nov 2003 mjw at celos.net wrote:
> > > Trying to copy the (binary) header of a input file directly
> > > to an output file, I've had repeatable seg faults.  The call:
> > > 
> > >   writeChar(hdr, outfh, nchars=6144)
> > > 
> > > when hdr just contains one empty string seems to be the
> > > culprit.  The stack traces weren't all that illuminating,
> > > with sig 11 in memory-related functions following this.  But
> > > in src/main/connections.c it looks like do_writechar doesn't
> > > check the length of strings when given an explicit nchars
> > > argument; so I think the strncpy() call will read too far.
> > 
> > All R strings should be null-terminated, so strncpy will only copy the
> > number of characters present (plus the null terminator) if less than n.
> 
> Quite true; I'd forgotten strncpy stopped at null.
> 
> > I can see that writeChars might write rubbish out, but not why it should 
> > segfault.  
> 
> Ok, I've just had a poke at it with ddd.  The above example
> faults in the memset() call (line 2772 connections.c) in both
> R versions.  I think the problem is underallocation of buf:
> 
>     len = 0;                            /* line 2757 */
>     for(i = 0; i < n; i++) {
> 	tlen = strlen(CHAR(STRING_ELT(object, i)));
> 	if (tlen > len) len = tlen;
>     }
>     buf = (char *) R_alloc(len + slen, sizeof(char));
> 
> which sets len to the longest string in object (in this
> case, 0 bytes), then allocates len+slen to buf.  gdb
> confirms len=0 and slen=1 at this point.  But a little later
> 
>     len = INTEGER(nchars)[i];           /* line 2770 */
>     s = CHAR(STRING_ELT(object, i));
>     memset(buf, '\0', len + slen);
>     strncpy(buf, s, len);
> 
> len is now set to [the first element of] nchars, which
> hasn't been checked, and is 10000000 (gdb confirms).  So the
> call to memset() copies way over the end of allocated buf.
> 
> Does that sound rational?  I'm not very familiar with R's
> internals.  I guess small overruns might not actually fault,
> because buf is within R's existing heap?
> 
> > It is also unclear to me what to do in this case: flag a user 
> > error?
> 
> I'm not sure; perhaps the most logical thing is indeed to
> pad with nulls (as I think it would do with this fault
> fixed), and not raise an error at all.
> 
> > > Using readBin and writeBin with integer() and size=1 seems
> > > to be the solution for header copying, but the faults still
> > > seemed worth reporting.
> > 
> > It's certainly the documented way.
> 
> Yup.  I tried readChar/writeChar thinking it might be
> quicker (no type conversion?) but it's obviously not
> feasible with null-terminated strings.
> 
> Mark <><
> 
> 

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-devel mailing list