[Rd] Changes to parser in R-devel
Yihui Xie
xie at yihui.name
Thu Jul 19 22:41:12 CEST 2012
I'm not sure if there is a bug somewhere; see this example:
getParseData(parse(text='function(x){}'))
  line1 col1 line2 col2 id parent          token terminal     text
1     1    1     1    8  1     11       FUNCTION     TRUE function
2     1    9     1    9  2     11            '('     TRUE        (
3     1   10     1   10  3      5 SYMBOL_FORMALS     TRUE        x
4     1   11     1   11  4     11            ')'     TRUE        )
5     1   12     1   12  6      8            '{'     TRUE        {
6     1   13     1   13  7      8            '}'     TRUE        }
7     1   12     1   12  5     11            '}'     TRUE        {
8     1   12     1   13  8     11           expr    FALSE
9     1    1     1   13 11      0           expr    FALSE
I get an additional { in the 7th row of the 'text' column.
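One way to spot rows like this is to compare each quoted terminal token with the text recorded for it. This is only a sketch: the helper name find_mismatched_terminals is made up for illustration, and it assumes that a quoted token such as '(' or '{' should carry exactly that character as its text. keep.source = TRUE is passed so that parse data is also recorded in non-interactive sessions.

```r
# Hypothetical helper (not part of R): return the terminal rows whose
# quoted token, e.g. "'{'", does not match the text stored for it.
find_mismatched_terminals <- function(pd) {
  # Quoted tokens appear in the token column as "'('", "'{'", "'}'", ...
  quoted <- pd$terminal & grepl("^'.*'$", pd$token)
  # Strip the surrounding single quotes to get the text we expect.
  expected <- sub("^'(.*)'$", "\\1", pd$token[quoted])
  candidates <- pd[quoted, , drop = FALSE]
  candidates[expected != candidates$text, , drop = FALSE]
}

pd <- getParseData(parse(text = "function(x){}", keep.source = TRUE))
find_mismatched_terminals(pd)
```

Under that assumption, any row returned here has a token and text that disagree.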
Another problem: for the empty function below, there is a noticeable
pause if you run the call more than once:
getParseData(parse(text='function(){}'))
and you may get wild line/col numbers like this:
   line1 col1     line2 col2 id parent    token terminal     text
1      1    1         1    8  1      9 FUNCTION     TRUE function
2      1    9         1    9  2      9      '('     TRUE        (
3      1   10         1   10  3      9      ')'     TRUE        )
4      1   11         1   11  4      6      '{'     TRUE        {
5      1   12         1   12  5      6      '}'     TRUE        }
6 320024   11 140106360   11 11      9      '}'     TRUE
7      1   11         1   12  6      9     expr    FALSE
8      1    1         1   12  9     11     expr    FALSE
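Rows with impossible positions can be picked out by checking the line and column values against the size of the source text. Again this is just a sketch: find_out_of_range is a hypothetical helper, and it uses the longest line as a loose upper bound for every column rather than checking each line separately.

```r
# Hypothetical helper (not part of R): return rows whose line/column
# positions fall outside the text that was parsed.
find_out_of_range <- function(pd, code) {
  lines <- strsplit(code, "\n", fixed = TRUE)[[1]]
  n_lines <- length(lines)
  max_col <- max(nchar(lines))  # loose bound: width of the longest line
  bad <- pd$line1 < 1 | pd$line1 > n_lines |
         pd$line2 < 1 | pd$line2 > n_lines |
         pd$col1  < 1 | pd$col1  > max_col |
         pd$col2  < 1 | pd$col2  > max_col
  pd[bad, , drop = FALSE]
}

code <- "function(){}"
pd <- getParseData(parse(text = code, keep.source = TRUE))
find_out_of_range(pd, code)
```

Running the last two lines repeatedly and looking for a non-empty result would be one way to catch the intermittent garbage values.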
Worse, it can crash R:
*** caught segfault ***
address 0x9488c20, cause 'memory not mapped'
Traceback:
1: parse(text = "function(){}")
2: getSrcref(x)
3: getSrcfile(x)
4: getParseData(parse(text = "function(){}"))
> sessionInfo()
R Under development (unstable) (2012-07-18 r59904)
Platform: i686-pc-linux-gnu (32-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA
On Wed, Jul 18, 2012 at 2:31 PM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
> I have just committed (in r59883) some changes to the R parser based on
> Romain Francois' parser package. Packages that made use of parser will
> hopefully find that the information in base R gives them what they need to
> work with, but the data is not identical to what parser recorded (since it
> was not consistent with some things already in R). One reason for the
> change was that the parser in the parser package was slightly different
> than the one in R; the hope is that by providing the services in R, it will
> make maintenance easier for things like code analysis, pretty printing,
> etc.
>
> See ?getParseData for details, and if you are maintaining a package that
> depends on parser, feel free to ask me for help in the transition, or make
> suggestions for changes if I've done something that causes you too much
> trouble.
>
> Duncan Murdoch
>
> P.S. to Qiang Li: as mentioned privately, the goal for this change was to
> reproduce output equivalent to what parser did, so I have not incorporated
> your suggested change to outlaw expressions like "x[[1] ]" (with an
> embedded space where it shouldn't be). After things settle down we can
> consider that change and others.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel