[Rd] evaluation in transform versus within
murdoch.duncan at gmail.com
Wed Apr 1 19:55:26 CEST 2015
On 01/04/2015 1:35 PM, Gabriel Becker wrote:
> The second argument to evalq is envir, so that line says, roughly, "call
> environment() to generate me a new environment within the environment
> defined by data".
I think that's not quite right. environment() returns the current
environment; it doesn't create a new one. It is evalq() that creates a
new environment from data, and environment() just returns it.
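The point can be checked at the prompt. This is only a minimal sketch: the data frame df_demo is a made-up example, and globalenv() stands in for the parent.frame() that within.data.frame would use.

```r
# Toy data frame; the name and contents are illustrative only.
df_demo <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))

# Assumption: globalenv() plays the role of parent.frame() here.
parent <- globalenv()

# evalq() turns the data frame into a new environment enclosed by parent;
# environment() merely returns the environment it is evaluated in.
e <- evalq(environment(), df_demo, parent)

is.environment(e)                  # TRUE
identical(parent.env(e), parent)   # TRUE: one new environment, enclosed by parent
sort(ls(e))                        # "x" "y"

# Evaluating environment() inside e hands back e itself, not a fresh environment.
identical(evalq(environment(), e), e)   # TRUE
```

So there is exactly one new environment, and its enclosure is the one passed as the third argument.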
Here's what happens. Each piece of code is followed by a description of
what it does.
    parent <- parent.frame()
Get the environment from which within.data.frame was called.
    e <- evalq(environment(), data, parent)
Create a new environment containing the columns of data, with the parent
being the environment where we were called.
Return it and store it in e.
    eval(substitute(expr), e)
Evaluate the expression in this new environment.
    l <- as.list(e)
Convert the environment to a list.
    l <- l[!vapply(l, is.null, NA, USE.NAMES = FALSE)]
Delete NULL entries from the list.
    nD <- length(del <- setdiff(names(data), (nl <- names(l))))
Find out if any columns were deleted.
    data[nl] <- l
Set the columns of data to the values from the list.
    if (nD)
        data[del] <- if (nD == 1)
            NULL
        else vector("list", nD)
Delete the columns from data which were deleted from the list.
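The same steps can be run by hand outside of within(). This is a sketch under assumptions rather than the function itself: df_demo and the expression are made-up examples, globalenv() stands in for parent.frame(), and quote() replaces substitute(expr) because there is no function argument to capture at top level.

```r
# Toy data frame; the name and contents are illustrative only.
df_demo <- data.frame(a = 1:3, b = 4:6)

# Assumption: globalenv() plays the role of parent.frame().
parent <- globalenv()

# One new environment holding the columns, enclosed by parent.
e <- evalq(environment(), df_demo, parent)

# Stand-in for eval(substitute(expr), e): add one variable, remove another.
eval(quote({
    total <- a + b
    rm(b)
}), e)

# Turn the environment back into a list and drop NULL entries.
l <- as.list(e)
l <- l[!vapply(l, is.null, NA, USE.NAMES = FALSE)]

# "b" is in the data frame but no longer in the list, so it was deleted.
nD <- length(del <- setdiff(names(df_demo), (nl <- names(l))))

# Copy the surviving and new variables back, then drop the deleted column.
df_demo[nl] <- l
if (nD)
    df_demo[del] <- if (nD == 1) NULL else vector("list", nD)

names(df_demo)   # "a" "total"
```

The column b disappears because rm(b) removed it from the environment, and total is appended because it was created there.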
> Note that that is only generating e, the environment that expr will be
> evaluated within in the next line (the call to eval). This means that expr
> is evaluated in an environment which is inside the environment defined by
> data, so you get non-standard evaluation in that symbols defined in data
> will be available to expr earlier in symbol lookup than those in the
> environment that within() was called from.
This again sounds like there are two environments created, when really
there's just one, but the last part is correct.
> This is easy to confirm from the behavior of these functions:
> > df = data.frame(x = 1:10, y = rnorm(10))
> > x = "I'm a character"
> > mean(x)
> [1] NA
> Warning message:
> In mean.default(x) : argument is not numeric or logical: returning NA
> > within(df, mean.x <- mean(x))
>     x            y mean.x
> 1   1  0.396758869    5.5
> 2   2  0.945679050    5.5
> 3   3  1.980039723    5.5
> 4   4 -0.187059706    5.5
> 5   5  0.008220067    5.5
> 6   6  0.451175885    5.5
> 7   7 -0.262064017    5.5
> 8   8 -0.652301191    5.5
> 9   9  0.673609455    5.5
> 10 10 -0.075590905    5.5
> > with(df, mean(x))
> [1] 5.5
> P.S. this is probably an r-help question.
> On Wed, Apr 1, 2015 at 10:21 AM, Joris Meys <jorismeys at gmail.com> wrote:
> > Dear list members,
> > I'm a bit confused about the evaluation of expressions using with() or
> > within() versus subset() and transform(). I always teach my students to use
> > with() and within() because of the warning mentioned in the help pages of
> > subset() and transform(). Both functions use nonstandard evaluation and are
> > to be used only interactively.
> > I've never seen that warning on the help page of with() and within(), so I
> > assumed both functions can safely be used in functions and packages. I've
> > now been told that both functions pose the same risk as subset() and
> > transform().
> > Looking at the source code I've noticed the extra step:
> > e <- evalq(environment(), data, parent)
> > which, at least according to my understanding, should ensure that the
> > functions follow the standard evaluation rules. Could somebody with more
> > knowledge than I have shed a bit of light on this issue?
> > Thank you
> > Joris
> > --
> > Joris Meys
> > Statistical consultant
> > Ghent University
> > Faculty of Bioscience Engineering
> > Department of Mathematical Modelling, Statistics and Bio-Informatics
> > tel : +32 (0)9 264 61 79
> > Joris.Meys at Ugent.be