[Rd] parse( connection) and source-keeping
Duncan Murdoch
murdoch.duncan at gmail.com
Thu Jan 12 14:57:00 CET 2012
On 11/01/2012 8:36 PM, Duncan Murdoch wrote:
> On 12-01-11 3:54 PM, Mark.Bravington at csiro.au wrote:
> > In R<= 2.13.x, calling 'parse( con)' where 'con' is a connection, 'options( keep.source)' is TRUE, and default 'srcfile' would preserve the source. In R>= 2.14.1, it doesn't.
>
> Actually, it preserved the "source" attribute of the function if it
> could, but didn't add a srcref. Sometimes it would fail, giving a
> message like
>
> Error in parse(textConnection(texto)) :
> function is too long to keep source (at line 8812)
>
>
> >
> >> tf<- tempfile()
> >> options( keep.source=TRUE)
> >> texto<- c( 'function() { # comment', '}')
> >> parse( text=texto)
> > expression(function() { # comment
> > })
> >> cat( texto, file=tf, sep='\n')
> >> parse( file=tf)
> > expression(function() { # comment
> > })
> >> parse( file( tf))
> > expression(function() {
> > })
> >> parse( textConnection( texto))
> > expression(function() {
> > })
> >
> > and yes I didn't bother closing any connections.
> >
> > My suspicion is that this change is unintentional, and it seems to me that the best option would be for 'connection' to work like 'text' does here, ie to attach a 'srcfilecopy' containing the contents.
>
> Yes, that does sound like a good idea.

I've taken a look, and this doesn't look like something I'll fix for
2.15.0. Here's why:

The entry points to the parser are really quite a mess, and need to be
cleaned up: working around this problem without that cleanup would make
them messier, and I don't have time for the cleanup before 2.15.0.

Part of the problem is that connections are so flexible: the parser
doesn't know whether the connection passed to it is at the beginning, or
whether you've already read some lines from it; it might not even have a
beginning (e.g. stdin()).
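To make the point about position concrete, here is a small sketch. The vector `texto` and the other variable names are made up for this illustration and are not from the report above. A text connection is partly consumed before it reaches parse(), so the parser only ever sees what is left:

```r
# Hypothetical three-line input; all names here are illustrative only.
texto <- c("x <- 1", "y <- 2", "z <- 3")

con <- textConnection(texto)      # text connections are created already open
first <- readLines(con, n = 1)    # consume the first line before parsing

# parse() continues from the connection's current position, so it has
# no way to see the line that was already read.
rest <- parse(con)
close(con)

first          # the line the parser never saw
length(rest)   # number of expressions parsed from the remaining lines
```

Nothing in the connection itself tells parse() that a line has gone missing, which is why it can't simply reconstruct the full source on its own.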
There is a relatively easy workaround if you really need this: you can
make the srcfilecopy yourself, and pass it as the "srcfile" argument to
parse. (This won't work on the stdin() connection, but if you're the
one creating the connection, you can perhaps work around that. By the
time parse() is called, it's too late.)
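A minimal sketch of that workaround, reusing the example from the original report. It assumes you can read the text yourself before handing a fresh connection to parse(); the variable names `lines`, `src` and `srcrefs` are just illustrative:

```r
options(keep.source = TRUE)

texto <- c("function() { # comment", "}")
tf <- tempfile()
cat(texto, file = tf, sep = "\n")

# Read the text first, so we know exactly what the parser will be given.
lines <- readLines(tf)

# Build the srcfilecopy by hand: a filename (used as a label) plus the lines.
src <- srcfilecopy(tf, lines)

# Pass a fresh, unread connection together with the matching srcfile.
con <- file(tf)
expr <- parse(con, srcfile = src)
close(con)

# The srcrefs now point into our copy of the source, comments included.
srcrefs <- attr(expr, "srcref")
as.character(srcrefs[[1]])
```

The same idea should apply to a text connection, e.g. parse(textConnection(texto), srcfile = srcfilecopy("<text>", texto)). The important part is that the lines stored in the srcfilecopy match what the connection will actually deliver, starting from its first line.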
The parser does manage to handle input coming from the console, because
that case uses a different entry point to the parser, and it keeps a
buffer of all input. (It needs to do this because you might not be
finished typing yet, and it will start over again when you enter the
next line.) So connections (even stdin()) could be handled in the same
way, and when I do the big cleanup, that's probably what will happen.

If you have a particular use case in mind and the workaround above isn't
sufficient, let me know.

Duncan Murdoch