[Rd] Changes to parser in R-devel

Fri Jul 20 00:50:12 CEST 2012

On 12-07-19 4:41 PM, Yihui Xie wrote:
> I'm not sure if there is a bug somewhere; see this example:

There's definitely a bug in the handling of empty lists, such as the 
empty list of commands in your first example and the empty list of 
arguments in your second.  There's a partial workaround currently in 
R-devel, but not a perfect fix.  (The cause is that I missed a conversion 
from Romain's 0-based column counting to the usual 1-based counting.)
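
As a rough illustration of the off-by-one involved (a sketch only, not the
actual parser code, which is written in C; to_one_based is a hypothetical
helper name used just for this example):

```r
# Hypothetical helper: shift 0-based column positions to the 1-based
# convention that R's source references use.
to_one_based <- function(col0) {
  stopifnot(is.numeric(col0), all(col0 >= 0))
  col0 + 1L
}

# In 'function(x){}' the braces sit at 0-based columns 11 and 12 ...
brace_cols0 <- c(open = 11L, close = 12L)

# ... which correspond to the 1-based columns reported for '{' and '}'.
brace_cols1 <- to_one_based(brace_cols0)
```

Skipping this shift for a single kind of node is enough to leave its
position one column off relative to its neighbours.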

I expect it will be fixed tomorrow, or sooner.

Duncan Murdoch

>
> getParseData(parse(text='function(x){}'))
>
>    line1 col1 line2 col2 id parent          token terminal     text
> 1     1    1     1    8  1     11       FUNCTION     TRUE function
> 2     1    9     1    9  2     11            '('     TRUE        (
> 3     1   10     1   10  3      5 SYMBOL_FORMALS     TRUE        x
> 4     1   11     1   11  4     11            ')'     TRUE        )
> 5     1   12     1   12  6      8            '{'     TRUE        {
> 6     1   13     1   13  7      8            '}'     TRUE        }
> 7     1   12     1   12  5     11            '}'     TRUE        {
> 8     1   12     1   13  8     11           expr    FALSE
> 9     1    1     1   13 11      0           expr    FALSE
>
> I get an additional { in the 7th row of the 'text' column.
>
> Another problem is that for this empty function below, there will be
> an obvious pause if you run it more than once:
>
> getParseData(parse(text='function(){}'))
>
> and you may get wild line/col numbers like this:
>
>     line1 col1     line2 col2 id parent    token terminal     text
> 1      1    1         1    8  1      9 FUNCTION     TRUE function
> 2      1    9         1    9  2      9      '('     TRUE        (
> 3      1   10         1   10  3      9      ')'     TRUE        )
> 4      1   11         1   11  4      6      '{'     TRUE        {
> 5      1   12         1   12  5      6      '}'     TRUE        }
> 6 320024   11 140106360   11 11      9      '}'     TRUE
> 7      1   11         1   12  6      9     expr    FALSE
> 8      1    1         1   12  9     11     expr    FALSE
>
> What is worse, it can crash R:
>
>   *** caught segfault ***
> address 0x9488c20, cause 'memory not mapped'
>
> Traceback:
>   1: parse(text = "function(){}")
>   2: getSrcref(x)
>   3: getSrcfile(x)
>   4: getParseData(parse(text = "function(){}"))
>
>
>> sessionInfo()
> R Under development (unstable) (2012-07-18 r59904)
> Platform: i686-pc-linux-gnu (32-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
>
> Regards,
> Yihui
> --
> Yihui Xie <xieyihui at gmail.com>
> Phone: 515-294-2465 Web: http://yihui.name
> Department of Statistics, Iowa State University
> 2215 Snedecor Hall, Ames, IA
>
>
> On Wed, Jul 18, 2012 at 2:31 PM, Duncan Murdoch
> <murdoch.duncan at gmail.com> wrote:
>> I have just committed (in r59883) some changes to the R parser based on
>> Romain Francois' parser package.  Packages that made use of parser will
>> hopefully find that the information in base R gives them what they need to
>> work with, but the data is not identical to what parser recorded (since it
>> was not consistent with some things already in R).  One reason for the
>> change was that the parser in the parser package
>> was slightly different than the one in R; the hope is that by providing the
>> services in R, it will make maintenance easier for things like code
>> analysis, pretty printing, etc.
>>
>> See ?getParseData for details, and if you are maintaining a package that
>> depends on parser, feel free to ask me for help in the transition, or make
>> suggestions for changes if I've done something that causes you too much
>> trouble.
>>
>> Duncan Murdoch
>>
>> P.S. to Qiang Li:  as mentioned privately, the goal for this change was to
>> reproduce output equivalent to what parser did, so I have not incorporated
>> your suggested change to outlaw expressions like "x[[1] ]"  (with an
>> embedded space where it shouldn't be).  After things settle down we can
>> consider that change and others.
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
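
The mismatch reported in this thread (a row whose token is '}' but whose
text is {) can be checked for mechanically. The following is a minimal
sketch, assuming a build of R that provides getParseData; the variable names
are illustrative only. It compares each literal terminal token such as '{'
with its recorded text, and checks that all positions are at least 1:

```r
# Parse with source references kept, so that parse data is available.
pd <- utils::getParseData(parse(text = "function(x){}", keep.source = TRUE))

# Terminal tokens written in quoted-literal form, e.g. "'{'" or "')'".
term    <- pd[pd$terminal, ]
literal <- grepl("^'.*'$", term$token)

# Strip the surrounding quotes to get the text each literal should carry.
expected <- gsub("^'|'$", "", term$token[literal])

# Rows where the recorded text disagrees with the token itself.
mismatch <- term[literal, ][term$text[literal] != expected, ]

# Line and column positions are 1-based, so none should be below 1.
positions_ok <- all(pd$line1 >= 1, pd$col1 >= 1, pd$line2 >= 1, pd$col2 >= 1)

print(mismatch)
print(positions_ok)
```

On a build where the problem described above is fixed, mismatch would be
expected to have no rows; on an affected build it should flag the odd row.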