R: Get Detailed Parse Information from Object

getParseData {utils}

R Documentation

Get Detailed Parse Information from Object

Description

If the "keep.source" option is TRUE, R's parser will attach detailed information on the object it has parsed. These functions retrieve that information.
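As a minimal sketch of the effect of source retention, the snippet below parses the same text twice, passing the keep.source argument of parse explicitly rather than relying on the global option. The variable names are illustrative only.

```r
code <- "y <- x + 1"

# Parse once keeping the source and once discarding it
with_src    <- parse(text = code, keep.source = TRUE)
without_src <- parse(text = code, keep.source = FALSE)

# Parse data is only available when the source was kept
is.null(getParseData(with_src))
is.null(getParseData(without_src))
```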

Usage

getParseData(x, includeText = NA)
getParseText(parseData, id)

Arguments

x

an expression returned from parse, or a function or other object with source reference information

includeText

logical; whether to include the text of parsed items in the result

parseData

a data frame returned from getParseData

id

a vector of item identifiers whose text is to be retrieved

Details

In version 3.0.0, the R parser was modified to include code written by Romain Francois in his parser package. This constructs a detailed table of information about every token and higher level construct in parsed code. This table is stored in the srcfile record associated with source references in the parsed code, and retrieved by the getParseData function.
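The following sketch illustrates the two kinds of rows in the table and the attached srcfile record. It assumes the code was parsed with keep.source = TRUE; the parsed text is an arbitrary example.

```r
p <- parse(text = "f <- function(a) a * 2", keep.source = TRUE)
d <- getParseData(p)

# Terminal rows are individual tokens; non-terminal rows are
# higher level constructs such as "expr"
table(d$token, d$terminal)

# The srcfile record that held the table is attached to the result
class(attr(d, "srcfile"))
```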

Value

For getParseData:
If parse data is not present, NULL. Otherwise a data frame is returned, containing the following columns:

line1

integer. The line number where the item starts. This is the parsed line number called "parse" in getSrcLocation, which ignores #line directives.

col1

integer. The column number where the item starts. The first character is column 1. This corresponds to "column" in getSrcLocation.

line2

integer. The line number where the item ends.

col2

integer. The column number where the item ends.

id

integer. An identifier associated with this item.

parent

integer. The id of the parent of this item.

token

character string. The type of the token.

terminal

logical. Whether the token is “terminal”, i.e. a leaf in the parse tree.

text

character string. If includeText is TRUE, the text of all tokens; if it is NA (the default), the text of terminal tokens only. If includeText is FALSE, this column is not included. Very long strings (with source of 1000 characters or more) will not be stored; a message giving their length and delimiter will be included instead.

The rownames of the data frame will be equal to the id values, and the data frame will have a "srcfile" attribute containing the srcfile record which was used. The rows will be ordered by starting position within the source file, with parent items occurring before their children.
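A short sketch of how the returned data frame might be inspected, and of how includeText changes the text column. The parsed text is an arbitrary example and the code assumes keep.source = TRUE.

```r
p <- parse(text = "y <- x + 1", keep.source = TRUE)
d <- getParseData(p)

# Column names of the result
names(d)

# Row names match the id column, so rows can be looked up by id
all(rownames(d) == as.character(d$id))
d[as.character(d$id[1]), ]

# includeText = FALSE omits the text column entirely
names(getParseData(p, includeText = FALSE))

# includeText = TRUE also stores the text of non-terminal items
getParseData(p, includeText = TRUE)$text
```

Indexing by as.character(id) rather than by row position keeps lookups correct even after the data frame has been subset.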

For getParseText:
A character vector of the same length as id containing the associated text items. If they are not included in parseData, they will be retrieved from the original file.
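As an illustration, the sketch below asks for the text of a non-terminal item using parse data obtained with the default includeText = NA, so the text has to be recovered from the retained source. Selecting items whose parent is 0 is an assumption about how top-level expressions are recorded, and may need adjusting for other inputs.

```r
p <- parse(text = "y <- x + 1", keep.source = TRUE)
d <- getParseData(p)  # default: text stored for terminal tokens only

# Ids of items with no parent, assumed here to be top-level expressions
top <- d$id[d$parent == 0]

# One text item is returned per id
txt <- getParseText(d, top)
print(txt)
```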

Note

There are a number of differences in the results returned by getParseData relative to those in the original parser code:

Fewer columns are kept.
The internal token number is not returned.
col1 starts counting at 1, not 0.
The id values are not attached to the elements of the parse tree; they are retained only in the table returned by getParseData.
#line directives are identified, but other comment markup (e.g., roxygen2 comments) is not.

By design, parse data expose details of the parser implementation, which are subject to change without notice. Applications computing on the parse data may require updates for each R release.

Author(s)

Duncan Murdoch

Examples

fn <- function(x) {
  x + 1 # A comment, kept as part of the source
}

d <- getParseData(fn)
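if (!is.null(d)) {  # d is NULL when the source was not kept
  plus <- which(d$token == "'+'")  # row(s) holding the "+" token
  parent_id <- d$parent[plus]      # id of the enclosing expression
  print(d[as.character(parent_id),])
  print(getParseText(d, parent_id))
}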

[Package utils version 4.6.0 Index]