[Rd] parser does not catch strings without closing quote

William Dunlap wdunlap at tibco.com
Fri Sep 2 17:28:09 CEST 2011


By the way, I noticed the problem in R only because S+ failed to
parse a file in the CRAN package SAPP. The file ends with a
garbage line containing a lone double quote:

  % tail -3 SAPP/data/res2003JUL26.R
  res2003JUL26 <- data.frame(res2003JUL26)
  names(res2003JUL26) <- c("no.", "longitude", "latitude", "magnitude", "time", "depth", "trans.time")
  ")

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> -----Original Message-----
> From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
> Sent: Friday, September 02, 2011 4:58 AM
> To: William Dunlap
> Cc: r-devel at r-project.org
> Subject: Re: [Rd] parser does not catch strings without closing quote
> 
> On 11-09-01 6:24 PM, William Dunlap wrote:
> > Shouldn't the parser complain about unfinished strings in files?
> > It doesn't and will tack on a newline if there isn't one there.
> >
> >    > withOption <- function(optionList, expr) {
> >    +     oldOption <- options(optionList)
> >    +     on.exit(options(oldOption))
> >    +     expr
> >    + }
> >
> >    > cat(file = tf <- tempfile(), "\"string without closing quote\n")
> >    > p <- withOption(list(keep.source = FALSE), parse(tf))
> >    > p
> >    expression("string without closing quote\n")
> >
> >    > cat(file = tf <- tempfile(), "\"string with no closing quote nor newline")
> >    > p <- withOption(list(keep.source = FALSE), parse(tf))
> >    > p
> >    expression("string with no closing quote nor newline\n")
> >
> > It does complain when parsing a character string with the same problem.
> >    > p <- withOption(list(keep.source = FALSE), parse(text = "\"unfinished string"))
> >    Error in parse(text = "\"unfinished string") :
> >      2:0: unexpected end of input
> >    1: "unfinished string
> >      ^
> 
> I assume this is a bug, but the way the parser handles input is quite a
> mess, so I'm not sure where to fix this.  The obvious place (within the
> parser where it is getting tokens) does not work:  the higher level code
> breaks up input into small pieces, and the parser frequently hits the
> end of a piece (at a newline or semicolon, for example), and signals an
> incomplete parse, which is restarted.
> 
> (For readers other than Bill:  this is necessary because the S language doesn't
> have a clear end of statement marker.  If the parser sees "x + ", it
> tries to get more input to finish the statement.  It's only an error if
> nothing more is there.)
> 
> A possibility would be to add a new token "incomplete string", which
> will eventually trigger an error if the restart doesn't complete it.
> 
> I'll think about it...
> 
> Duncan Murdoch


