[Rd] parser does not catch strings without closing quote
William Dunlap
wdunlap at tibco.com
Fri Sep 2 17:28:09 CEST 2011
By the way, I noticed the problem in R because S+ could not
parse a file in the CRAN package SAPP. The file ends with
a garbage line containing an unmatched quote:
% tail -3 SAPP/data/res2003JUL26.R
res2003JUL26 <- data.frame(res2003JUL26)
names(res2003JUL26) <- c("no.", "longitude", "latitude", "magnitude", "time", "depth", "trans.time")
")
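
A minimal sketch of how a file like this could be checked from R. It relies on the observation in the quoted message that parse(text=) does complain about an unfinished string, even where parse(file) may not. The helper name parses_cleanly is hypothetical; it is not part of R or of SAPP.

```r
# Hypothetical helper: TRUE if the contents of `path` parse without error
# when handed to parse() as text rather than as a file.
parses_cleanly <- function(path) {
  lines <- readLines(path, warn = FALSE)
  tryCatch({
    parse(text = lines, keep.source = FALSE)
    TRUE
  }, error = function(e) FALSE)
}

# A small file that ends with a stray quote, like the one shown above
bad <- tempfile(fileext = ".R")
writeLines(c('x <- c("a", "b")', '")'), bad)

# The same file without the garbage line
good <- tempfile(fileext = ".R")
writeLines('x <- c("a", "b")', good)

parses_cleanly(bad)
parses_cleanly(good)
```

Reading the file with readLines() and parsing the lines as text avoids the file-based code path altogether, so the check does not depend on whether that path reports the unfinished string.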
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
> Sent: Friday, September 02, 2011 4:58 AM
> To: William Dunlap
> Cc: r-devel at r-project.org
> Subject: Re: [Rd] parser does not catch strings without closing quote
>
> On 11-09-01 6:24 PM, William Dunlap wrote:
> > Shouldn't the parser complain about unfinished strings in files?
> > It doesn't and will tack on a newline if there isn't one there.
> >
> > > withOption <- function(optionList, expr) {
> > + oldOption <- options(optionList)
> > + on.exit(options(oldOption))
> > + expr
> > + }
> >
> > > cat(file=tf<-tempfile(), "\"string without closing quote\n")
> > > p <- withOption(list(keep.source=FALSE), parse(tf))
> > > p
> > expression("string without closing quote\n")
> >
> > > cat(file=tf<-tempfile(), "\"string with no closing quote nor newline")
> > > p <- withOption(list(keep.source=FALSE), parse(tf))
> > > p
> > expression("string with no closing quote nor newline\n")
> >
> > It does complain when parsing a character string with the same problem.
> > > p <- withOption(list(keep.source=FALSE), parse(text="\"unfinished string"))
> > Error in parse(text = "\"unfinished string") :
> > 2:0: unexpected end of input
> > 1: "unfinished string
> > ^
>
> I assume this is a bug, but the way the parser handles input is quite a
> mess, so I'm not sure where to fix this. The obvious place (within the
> parser where it is getting tokens) does not work: the higher level code
> breaks up input into small pieces, and the parser frequently hits the
> end of a piece (at a newline or semicolon, for example), and signals an
> incomplete parse, which is restarted.
>
> (For others than Bill: this is necessary because the S language doesn't
> have a clear end of statement marker. If the parser sees "x + ", it
> tries to get more input to finish the statement. It's only an error if
> nothing more is there.)
>
> A possibility would be to add a new token "incomplete string", which
> will eventually trigger an error if the restart doesn't complete it.
>
> I'll think about it...
>
> Duncan Murdoch
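
The "incomplete parse, then restart" behaviour described in the quoted message can be imitated at the R level. This is only a sketch under stated assumptions: needs_more_input is a hypothetical helper, not how the parser is implemented internally, and it assumes English error messages in which an unfinished statement is reported as "unexpected end of input" (some R versions may report an unfinished string with an INCOMPLETE_STRING token instead, so both are matched).

```r
# Hypothetical helper: does `lines` stop in the middle of a statement?
# Assumption: parse() signals that case with an error whose (English)
# message mentions "unexpected end of input" or "INCOMPLETE_STRING".
needs_more_input <- function(lines) {
  msg <- tryCatch({
    parse(text = lines, keep.source = FALSE)
    ""                                   # parsed fine: nothing is missing
  }, error = function(e) conditionMessage(e))
  grepl("unexpected end of input|INCOMPLETE_STRING", msg)
}

needs_more_input("x +")            # statement not finished yet
needs_more_input(c("x +", "1"))    # finished by the next piece of input
needs_more_input("\"unfinished")   # string that is never closed
```

A caller that feeds input piece by piece would keep appending lines while the helper returns TRUE, and report an error only if the input runs out first.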