[Rd] A Scanner and Parser for R

Zed Shaw zedshaw at zedshaw.com
Thu Apr 24 12:23:14 MEST 2003


Hi All,

I have recently written an R scanner and parser for the Obversive 
project.  I wrote it using re2c and lemon in order to make sure that it 
doesn't require any external libraries.  It currently has only 9 
conflicts, which I can't quite figure out how to resolve, and I'm sure it 
doesn't parse exactly correctly.  One thing I can't quite fix is the 
ambiguous use of newlines in R.  I know that R currently uses a 
"workaround" where there is an empty production called cr, and this is 
used to tell the scanner when to start ignoring newlines.  I'll most 
likely end up doing the same thing.
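
To make the newline problem concrete, here is a rough sketch in Python 
(not the attached re2c/lemon code, and not R's actual scanner).  It 
assumes two mechanisms: a flag the parser can set to make the scanner 
swallow newlines, standing in for what the cr production does, and a 
bracket stack so that newlines inside ( and [ are dropped.  The names 
Scanner, eat_lines, and scan_all, the eat_after parameter, and the token 
set are all made up for illustration.

```python
import re

# Deliberately tiny token set; real R has many more operators and literals.
TOKEN_RE = re.compile(r"""
    (?P<NEWLINE>\n)
  | (?P<SKIP>[ \t]+)
  | (?P<OPEN>[(\[{])
  | (?P<CLOSE>[)\]}])
  | (?P<NUM>\d+(?:\.\d+)?)
  | (?P<NAME>[A-Za-z.][A-Za-z0-9._]*)
  | (?P<OP><-|[-+*/=,;])
""", re.VERBOSE)

MATCHING = {")": "(", "]": "[", "}": "{"}


class Scanner:
    def __init__(self, source):
        self.source = source
        self.pos = 0
        self.context = []       # stack of currently open brackets
        self.eat_lines = False  # hypothetical hook: the parser sets this

    def next_token(self):
        """Return the next (kind, text) pair, or None at end of input."""
        while self.pos < len(self.source):
            m = TOKEN_RE.match(self.source, self.pos)
            if m is None:
                raise SyntaxError(f"unexpected character at offset {self.pos}")
            self.pos = m.end()
            kind, text = m.lastgroup, m.group()
            if kind == "SKIP":
                continue
            if kind == "NEWLINE":
                inside = bool(self.context) and self.context[-1] in "(["
                if self.eat_lines or inside:
                    continue    # newline is not a statement terminator here
                return (kind, text)
            self.eat_lines = False  # any real token ends the eating mode
            if kind == "OPEN":
                self.context.append(text)
            elif kind == "CLOSE":
                if not self.context or self.context[-1] != MATCHING[text]:
                    raise SyntaxError(f"unbalanced {text!r}")
                self.context.pop()
            return (kind, text)
        return None


def scan_all(source, eat_after=()):
    """Collect all tokens.  eat_after stands in for the parser: after any
    NAME listed there, the scanner is told to ignore following newlines."""
    scanner, tokens = Scanner(source), []
    while (tok := scanner.next_token()) is not None:
        tokens.append(tok)
        if tok[0] == "NAME" and tok[1] in eat_after:
            scanner.eat_lines = True
    return tokens
```

With this sketch, scan_all("x <- f(1,\n 2)\n") yields a single NEWLINE 
token, because the newline inside the call is dropped.  Newlines inside 
{ } are still reported, since there they separate statements.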

I'm sharing this with the R dev team to get some input into its 
correctness.  I'm curious about general opinions, usefulness, etc.  The 
main purpose of this parser is to give users better feedback on R code 
before they submit it to R for processing.  We're not trying explicitly 
to replicate R's parsing behavior, but more trying to enforce a 
particular style.  This is because we generate R code from GUI 
interfaces, so the user isn't writing code, only reviewing it for 
errors.

The main advantages that this parser has over the current R parser are 
that it is damn fast and doesn't require external libraries.  My own 
informal tests on my machine show that it can parse 25M of R code in 
about 40 seconds, including error detection, reporting, recovery, and 
logging of the entire parsing internal state.  It is also designed to 
have no memory leaks, which is admittedly really easy when all you do 
is parse.

Anyway, send me your thoughts on this.  Thanks.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: parser.y
Type: application/octet-stream
Size: 5650 bytes
Desc: not available
Url : https://www.stat.math.ethz.ch/pipermail/r-devel/attachments/20030424/25fc7e79/parser.exe
-------------- next part --------------
A non-text attachment was scrubbed...
Name: scanner.re
Type: application/octet-stream
Size: 5143 bytes
Desc: not available
Url : https://www.stat.math.ethz.ch/pipermail/r-devel/attachments/20030424/25fc7e79/scanner.exe

