Get Detailed Parse Information from Object


If the "keep.source" option is TRUE, R's parser will attach detailed information on the object it has parsed. These functions retrieve that information.


getParseData(x, includeText = NA)
getParseText(parseData, id)



x: an expression returned from parse, or a function or other object with source reference information


includeText: logical; whether to include the text of parsed items in the result


parseData: a data frame returned from getParseData


id: a vector of item identifiers whose text is to be retrieved


In version 3.0.0, the R parser was modified to include code written by Romain Francois in his parser package. This constructs a detailed table of information about every token and higher-level construct in parsed code. This table is stored in the srcfile record associated with source references in the parsed code, and is retrieved by the getParseData function.
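As a minimal sketch of how this depends on source references, the snippet below parses the same text with and without keep.source. The variable names p, q and d are illustrative only, and the exact rows printed may differ between R releases.

```r
# Parse with source references kept, so the parse table is recorded
# in the associated srcfile record.
p <- parse(text = "y <- x + 1", keep.source = TRUE)
d <- getParseData(p)
print(d[, c("id", "parent", "token", "terminal", "text")])

# Parse the same text without source references: there is no srcfile
# record, so there is no parse data to retrieve.
q <- parse(text = "y <- x + 1", keep.source = FALSE)
print(is.null(getParseData(q)))
```

Passing keep.source explicitly avoids relying on the session default, which typically differs between interactive and non-interactive use.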


For getParseData:
If parse data is not present, NULL. Otherwise a data frame is returned, containing the following columns:


line1: integer. The line number where the item starts. This is the parsed line number called "parse" in getSrcLocation, which ignores #line directives.


col1: integer. The column number where the item starts. The first character is column 1. This corresponds to "column" in getSrcLocation.


line2: integer. The line number where the item ends.


col2: integer. The column number where the item ends.


id: integer. An identifier associated with this item.


parent: integer. The id of the parent of this item.


token: character string. The type of the token.


terminal: logical. Whether the token is “terminal”, i.e. a leaf in the parse tree.


text: character string. If includeText is TRUE, the text of all tokens; if it is NA (the default), the text of terminal tokens. If includeText is FALSE, this column is not included. Very long strings (with source of 1000 characters or more) will not be stored; a message giving their length and delimiter will be included instead.

The rownames of the data frame will be equal to the id values, and the data frame will have a "srcfile" attribute containing the srcfile record which was used. The rows will be ordered by starting position within the source file, with parent items occurring before their children.
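The layout described above can be explored with a short sketch such as the following. The input text and the names top, children and sf are assumptions chosen for illustration, not part of the documented interface.

```r
p <- parse(text = "f(a, b)", keep.source = TRUE)
d <- getParseData(p)

# Row names mirror the id column, so an id (as a string) selects its row.
same_ids <- all(rownames(d) == as.character(d$id))
print(same_ids)

# For this single-expression input the first row is the outermost item;
# its children are the rows that name it as their parent.
top <- d$id[1]
children <- d[d$parent == top, c("id", "token", "text")]
print(children)

# The srcfile record used by the parser travels with the data frame.
sf <- attr(d, "srcfile")
print(class(sf))
```

Indexing with as.character(id) rather than a bare integer matters because the ids are row names, not row positions.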

For getParseText:
A character vector of the same length as id containing the associated text items. If they are not included in parseData, they will be retrieved from the original file.
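A small sketch of the fallback to the original source follows, assuming a one-line input parsed from text. The names has_text, whole and d_all are illustrative only.

```r
p <- parse(text = "z <- sqrt(x) * 2", keep.source = TRUE)

# Drop the text column entirely; positions and token types remain.
d <- getParseData(p, includeText = FALSE)
has_text <- "text" %in% names(d)
print(has_text)

# getParseText recovers the text from the source using the stored
# line and column positions of the requested item.
top <- d$id[1]
whole <- getParseText(d, top)
print(whole)

# With includeText = TRUE, non-terminal items carry their text too.
d_all <- getParseData(p, includeText = TRUE)
print(d_all[!d_all$terminal, c("id", "text")])
```

Omitting the text column keeps the data frame small for large files, at the cost of a lookup in the source whenever text is needed.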


There are a number of differences in the results returned by getParseData relative to those in the original parser code:

By design, parse data expose details of the parser implementation, which are subject to change without notice. Applications that compute on the parse data may need to be updated for each R release.


Duncan Murdoch


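fn <- function(x) {
  x + 1 # A comment, kept as part of the source
}

d <- getParseData(fn)
# d is NULL when the function was defined without source references
if (!is.null(d)) {
  # Locate the '+' token, then print the text of its parent expression
  plus <- which(d$token == "'+'")
  sum <- d$parent[plus]
  print(getParseText(d, sum))
}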
