[R] Concepts question: environment, frame, search path
Duncan Murdoch
murdoch at stats.uwo.ca
Tue May 1 13:16:26 CEST 2007
On 01/05/2007 12:29 AM, Graham Wideman wrote:
> Folks:
>
> I'd appreciate if someone could straighten me out on a few concepts which
> are described a bit ambiguously in the docs.
>
> 1. data.frame:
> ----------------
> Refan p84: 'A data frame is a list of variables of the same length with
> unique row names, given class "data.frame".'
>
> I probably don't need to point out how opaque that is!
Which manual are you looking at? The "reference index" (refman.pdf)? It
doesn't usually include statements like that; they are usually found in
the Introduction to R (R-intro.pdf) or the R Language Definition
(R-lang.pdf). But since the refman is just a collection of man pages,
it might be in there somewhere. And since the manuals do get updated,
that statement may not be present in the current release. (I did a
quick search of the source, and couldn't spot it, but my search might
have failed because of line breaks, strange formatting, or looking in
the wrong place.)
By the way, it's generally best to cite the section name where you found
a quote, because the pagination varies from system to system. Even
better would be to give a URL to the online HTML version at
http://cran.r-project.org/manuals.html.
For future reference, if you are suggesting a change, it's best to cite
the line number in the source at
https://svn.r-project.org/R/trunk/doc/manual in the *.texi files or
https://svn.r-project.org/R/trunk/src/library/*/man/*.Rd for man pages,
and send such suggestions to the R-devel list.
> Anyhow, key question: Some places in the docs seem pretty firm that a
> data.frame is basically a 2-D array with:
> a) named rows and
> b) columns whose items within a column must be of uniform data type.
>
> Elsewhere, it seems like a data.frame can be a collection of arbitrary
> variables.
The former interpretation is correct. Since the variables all have the
same length, things like df[i, j] make sense: they choose the i'th
entry from the j'th variable (according to the "refan" definition), or
the i'th row, j'th column (according to the 2-D array interpretation).
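
To make the two views concrete, here is a minimal sketch. The data frame
and its column names (name, score) are made up for illustration; the
point is only that the "list of variables" view and the "2-D array" view
reach the same element.

```r
# A small data frame: two variables (columns) of the same length.
df <- data.frame(
  name  = c("a", "b", "c"),
  score = c(1.5, 2.5, 3.5),
  stringsAsFactors = FALSE
)

is.list(df)        # the underlying structure is a list of columns
class(df)          # the class attribute is "data.frame"
sapply(df, class)  # each column has its own type

# 2-D array view: row 2, column "score".
by_index <- df[2, "score"]

# List-of-variables view: entry 2 of the variable "score".
by_variable <- df[["score"]][2]

identical(by_index, by_variable)
```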
>
> 2. environment
> ---------------
> Refman p122: "Environments consist of a frame, or collection of named
> objects, and a pointer to an enclosing environment."
>
> Is the "or" here explaining parenthetically that a frame is a collection of
> named objects, or is it separating two alternative structures for an
> environment?
The former.
>
> If the former, does this imply that a frame can contain arbitrary variables?
Yes, but a frame isn't an R object, it's a concept that appears in
descriptions, e.g. part of an environment, or the local variables
created during function evaluation, etc.
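
As a rough illustration of "a frame plus an enclosing environment", the
sketch below builds two environments by hand. The names outer, inner, x
and y are invented for the example.

```r
# Two environments; inner's enclosing environment is outer.
outer <- new.env()
inner <- new.env(parent = outer)

# Each environment's frame holds arbitrary named objects.
assign("x", 1, envir = outer)
assign("y", "some text", envir = inner)

# ls() lists only the names in inner's own frame.
inner_names <- ls(inner)

# The enclosing environment of inner is outer.
encloses <- identical(parent.env(inner), outer)

# Lookup that starts in inner falls through to outer for x ...
x_found <- get("x", envir = inner)

# ... but x is not in inner's own frame.
x_local <- exists("x", envir = inner, inherits = FALSE)
```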
>
> And "pointer"? Is that a type of thing in R?
No, there are no pointers in R. There are a couple of tricks to fake
them (e.g. environment objects aren't copied when assigned, you just get
a new reference to the same environment; this allows you to construct
something like a pointer by wrapping an object in an environment), but I
don't recommend using these routinely.
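
For the curious, the trick looks roughly like this. It is a sketch only;
counter, alias and increment are hypothetical names, and the list at the
end is there for contrast with ordinary copy-on-assignment behaviour.

```r
# Wrap a value in an environment.
counter <- new.env()
counter$value <- 0

# Assignment does not copy an environment: alias refers to the same one.
alias <- counter

# A function can therefore modify the caller's object through the reference.
increment <- function(env) {
  env$value <- env$value + 1
  invisible(env)
}
increment(alias)
counter$value  # changed through alias

# Contrast: an ordinary list behaves as a value, so the original is untouched.
plain <- list(value = 0)
modified <- plain
modified$value <- 1
plain$value
```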
>
> 3. R search path; attach()
> ----------------------------
> The R search path appears to hold the list of "collections of data" (my
> term) that can be accessed by a user's commands. Refman p27 says that the
> search path can hold items that are data.frame, list, environment or R data
> file (on disk). Yet R-intro p28 describes attach() as taking a "directory
> name" argument. What is the concept "directory" in this context?
I haven't read the preceding pages carefully, but that looks like an
error. The usual argument to attach is a package name, and what gets
attached is an environment holding the exports from the package.
Packages are stored in directories in the file system, so maybe that's
what the author of that line had in mind.
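
A small sketch of how the search path changes, using the base functions
search(), library(), attach() and detach(). The data frame scores and its
column height are invented for the example. Packages are normally put on
the search path with library(), while attach() also accepts the data
frame or list mentioned in the reference manual passage quoted above.

```r
# library() attaches a package; it then appears as "package:<name>".
library(stats)
stats_attached <- "package:stats" %in% search()

# attach() can also put a data frame on the search path.
scores <- data.frame(height = c(150, 160, 170))
attach(scores)

# By default the new entry goes in position 2, named after the object.
attached_name <- search()[2]

# The columns are now found by name through the search path.
mean_height <- mean(height)

# detach() removes the entry again.
detach(scores)
still_attached <- "scores" %in% search()
```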
Duncan Murdoch