[Rd] access to R parse tree for Lisp-style macros?
Duncan Murdoch
murdoch at stats.uwo.ca
Mon Oct 3 15:42:50 CEST 2005
On 10/3/2005 3:25 AM, Andrew Piskorski wrote:
> R folks, I'm curious about possible support for Lisp-style macros in
> R. I'm aware of the "defmacro" support for S-Plus and R discussed
> here:
>
> http://www.biostat.wustl.edu/archives/html/s-news/2002-10/msg00064.html
>
> but that's really just a syntactic short-cut to the run-time use of
> substitute() and eval(), which you could manually put into a function
> yourself if you cared to. (AKA, not at all equivalent to Lisp
> macros.) The mlocal() function in mvbutils also has seemingly similar
> macro-using-eval properties:
>
> http://cran.r-project.org/src/contrib/Descriptions/mvbutils.html
> http://www.maths.lth.se/help/R/.R/library/mvbutils/html/mlocal.html
>
> I could of course pre-process R source code, either using a custom
> script or something like M5:
>
> http://www.soe.ucsc.edu/~brucem/samples.html
> http://groups.google.com/group/comp.compilers/browse_thread/thread/8ece2f34620f7957/000475ab31140327
>
> But that's not what I'm asking about here. As I understand it,
> Lisp-style macros manipulate the already-parsed syntax tree. This
> seems very uncommon in non-Lisp languages and environments, but some -
> like Python - do have such support. (I don't use Python, but I'm told
> that its standard parser APIs are as powerful as Lisp macros, although
> clunkier to use.)
>
> Is implementing Lisp-style macros feasible in R? Has anyone
> investigated this or tried to do it?
>
> What internal representation does R use for its parse tree, and how
> could I go about manipulating it in some fashion, either at package
> build time or at run time, in order to support true Lisp-style macros?
It is like a list of lists, with modes attached that say how each piece is
to be interpreted. parse() returns an object of mode "expression", which is
a list of function calls, names, or atomic constants. A function call is
stored as a list whose head is the function name, with the subsequent
entries being the arguments.
The mode may be "expression", "call", "name", or others, depending on what
you are actually dealing with.
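As a rough illustration of that "list of lists" structure, here is a minimal sketch of a recursive walk over a parsed expression. The helper name called_functions() is made up for this example and is not part of R; the sketch also assumes the calls contain no empty arguments (as in x[1, ]), which it does not handle.

```r
# Hypothetical helper: collect the names of all functions called
# anywhere inside a parsed expression, in depth-first order.
called_functions <- function(x) {
  if (is.expression(x)) {
    # An "expression" is a list of top-level calls, names, or constants.
    unlist(lapply(as.list(x), called_functions), use.names = FALSE)
  } else if (is.call(x)) {
    parts <- as.list(x)
    # The head of a call is usually a name, but may itself be a call.
    head <- if (is.name(parts[[1]])) {
      as.character(parts[[1]])
    } else {
      called_functions(parts[[1]])
    }
    # The remaining entries are the arguments; recurse into each one.
    c(head, unlist(lapply(parts[-1], called_functions), use.names = FALSE))
  } else {
    # Names and atomic constants are leaves of the tree.
    character(0)
  }
}

tree <- parse(text = "f(x, 1 + g(y))")
fns <- called_functions(tree)
print(fns)
```

Nothing here is evaluated: f and g need not exist, because the code only inspects the parsed structure.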
>
> Whenever I try something like this in R:
>
> > dput(parse(text="1+2"))
> expression(1 + 2)
>
> what I see looks exactly like R code - that '1 + 2' expression doesn't
> look very "parsed" to me. Is that really it, or is there some sort of
> Scheme-like parse tree hiding underneath? I see that the interactive
> Read-Eval-Print loop basically calls R_Parse1() in "src/main/gram.c",
> but from there I'm pretty much lost.
There's a parse tree underneath. R is being helpful and deparsing it
for you for display purposes.
To see it as a list, use "as.list" to strip off the mode, e.g.
> as.list(parse(text="1+2"))
[[1]]
1 + 2
# A list containing one expression. Expand it:
> as.list(parse(text="1+2")[[1]])
[[1]]
`+`
[[2]]
[1] 1
[[3]]
[1] 2
# A function call to `+` with two arguments. The arguments are atomic.
Use "mode" to work out how these are interpreted:
> mode(parse(text="1+2"))
[1] "expression"
> mode(parse(text="1+2")[[1]])
[1] "call"
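Since a call behaves like a list, it can also be modified in place and then evaluated, which is the kind of manipulation a macro facility would need. The following is only a small sketch using base R functions (as.name, deparse, eval); the variable names are arbitrary.

```r
# Take the parsed call `+`(1, 2) out of the expression vector.
e <- parse(text = "1 + 2")[[1]]

# Element 1 is the function being called; swap `+` for `*`.
e[[1]] <- as.name("*")

# Elements 2 and 3 are the arguments; replace the second argument.
e[[3]] <- 10

# deparse() turns the modified tree back into source text,
# and eval() evaluates it.
rewritten <- deparse(e)
value <- eval(e)

print(rewritten)
print(value)
```

The rewrite happens at run time on the already-parsed tree, without going back through the text of the source.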
>
> Also, what happens at package build time? I know that R CMD INSTALL
> generates binary *.rdb and *.rdx files for my package, but what do
> those do exactly, and how do they relate to the REPL and R_Parse1()?
>
> Finally, are there any docs describing the design and implementation
> of the R internals? Should I be looking anywhere other than the R
> developer page here?:
The source code is sometimes the best place for low level details like
this. The R Language manual sometimes gives low level details, but it
is uneven in its coverage; I forget if it covers this.
Duncan Murdoch