[Rd] evaluation in transform versus within
murdoch.duncan at gmail.com
Wed Apr 1 21:18:39 CEST 2015
On 01/04/2015 2:33 PM, Joris Meys wrote:
> Thank you for the insights. I understood as much from the code, but I
> can't really see how this can cause a problem when using with() or
> within() within a package or a function. The environments behave like
> I would expect, as does the evaluation of the arguments. The second
> argument is supposed to be an expression, so I would expect that
> expression to be evaluated in the data frame first.
I don't know the context within which you were told that they are
problematic, but one issue is that it makes typo detection harder, since
the code analysis won't see typos.

    df <- data.frame(col1 = 1)
    global <- 3
    with(df, col1 + global)  # fine
    with(df, col1 + Global)  # typo, but still no warning
    df$col1 + global         # fine
    df$col1 + Global         # "no visible binding for global variable 'Global'"

and of course you'll get in a real mess later with the with() code if
you add a column named "global" to your dataframe.
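
To make that last point concrete, here is a minimal sketch (the data frame and the names before and after are only illustrative): once a column called "global" exists, with() silently picks it up instead of the variable in the calling environment.

```r
df <- data.frame(col1 = 1)
global <- 3

# 'global' is not a column yet, so lookup falls through to the caller
before <- with(df, col1 + global)   # 1 + 3

# Add a column with the same name as the outer variable
df$global <- 10

# The same expression now uses the column, with no warning
after <- with(df, col1 + global)    # 1 + 10
```

The expression text is identical in both calls; only the contents of df changed.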
> I believed the warning in subset() and transform() refers to the
> consequences of using the dotted argument and the evaluation thereof
> inside the function, but I might have misunderstood this. I've always
> considered within() the programming equivalent of the convenience
> function transform().
> Sorry for using the r-devel list, but I reckoned this could have
> consequences for package developers like me. More explicitly: if
> within() poses the same risk as transform() (which I'm still not sure
> of), a warning on the help page of within() would be suited imho. I
> will use the r-help list in the future.
> Kind regards
> On Wed, Apr 1, 2015 at 7:55 PM, Duncan Murdoch
> <murdoch.duncan at gmail.com> wrote:
> On 01/04/2015 1:35 PM, Gabriel Becker wrote:
> The second argument to evalq is envir, so that line says, roughly,
> "call environment() to generate me a new environment within the
> environment defined by data".
> I think that's not quite right. environment() returns the current
> environment, it doesn't create a new one. It is evalq() that
> created a new environment from data, and environment() just
> returns it.
> Here's what happens. I've put the code first, the description of
> what happens on the line below.
>     parent <- parent.frame()
> Get the environment from which within.data.frame was called.
>     e <- evalq(environment(), data, parent)
> Create a new environment containing the columns of data, with the
> parent being the environment where we were called. Return it and
> store it in e.
>     eval(substitute(expr), e)
> Evaluate the expression in this new environment.
>     l <- as.list(e)
> Convert it to a list.
>     l <- l[!vapply(l, is.null, NA, USE.NAMES = FALSE)]
> Delete NULL entries from the list.
>     nD <- length(del <- setdiff(names(data), (nl <- names(l))))
> Find out if any columns were deleted.
>     data[nl] <- l
> Set the columns of data to the values from the list.
>     if (nD)
>         data[del] <- if (nD == 1) NULL else vector("list", nD)
> Delete the columns from data which were deleted from the list.
> Note that that is only generating e, the environment that expr will be
> evaluated within in the next line (the call to eval). This means that
> expr is evaluated in an environment which is inside the environment
> defined by data, so you get non-standard evaluation in that symbols
> defined in data will be available to expr earlier in symbol lookup
> than those in the environment that within() was called from.
> This again sounds like there are two environments created, when
> really there's just one, but the last part is correct.
> Duncan Murdoch
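
The walk-through above can be checked on a toy data frame. This is only a sketch: it runs the first steps by hand outside of within(), so environment() stands in for parent.frame(), and the names one_env, cols and new_names are made up for the illustration.

```r
data <- data.frame(x = 1:3)
parent <- environment()          # stands in for parent.frame()

# evalq() builds a single environment from the columns of data;
# environment() just returns that environment
e <- evalq(environment(), data, parent)

one_env <- identical(parent.env(e), parent)  # its parent is the caller
cols <- ls(e)                                # the columns live in e itself

# Evaluate an expression in e, as eval(substitute(expr), e) would
eval(quote(y <- x * 2), e)
new_names <- sort(names(as.list(e)))         # the new variable lands in e
```

Since parent.env(e) is the calling environment, there is no intermediate environment between the columns and the caller; assignments made by the expression end up next to the columns in e.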
> This is easy to confirm from the behavior of these functions:
> > df = data.frame(x = 1:10, y = rnorm(10))
> > x = "I'm a character"
> > mean(x)
> [1] NA
> Warning message:
> In mean.default(x) : argument is not numeric or logical: returning NA
> > within(df, mean.x <- mean(x))
>     x            y mean.x
> 1   1  0.396758869    5.5
> 2   2  0.945679050    5.5
> 3   3  1.980039723    5.5
> 4   4 -0.187059706    5.5
> 5   5  0.008220067    5.5
> 6   6  0.451175885    5.5
> 7   7 -0.262064017    5.5
> 8   8 -0.652301191    5.5
> 9   9  0.673609455    5.5
> 10 10 -0.075590905    5.5
> > with(df, mean(x))
> [1] 5.5
> P.S. this is probably an r-help question.
> On Wed, Apr 1, 2015 at 10:21 AM, Joris Meys
> <jorismeys at gmail.com> wrote:
> > Dear list members,
> > I'm a bit confused about the evaluation of expressions using with() or
> > within() versus subset() and transform(). I always teach my students to
> > use with() and within() because of the warning mentioned in the help
> > pages of subset() and transform(). Both functions use nonstandard
> > evaluation and are to be used only interactively.
> > I've never seen that warning on the help page of with() and within(),
> > so I assumed both functions can safely be used in functions and
> > packages. I've now been told that both functions pose the same risk as
> > subset() and transform().
> > Looking at the source code I've noticed the extra step:
> >     e <- evalq(environment(), data, parent)
> > which, at least according to my understanding, should ensure that the
> > functions follow the standard evaluation rules. Could somebody with
> > more knowledge than I have shed a bit of light on this issue?
> > Thank you
> > Joris
> > --
> > Joris Meys
> > Statistical consultant
> > Ghent University
> > Faculty of Bioscience Engineering
> > Department of Mathematical Modelling, Statistics and Bio-Informatics
> > tel : +32 (0)9 264 61 79
> > Joris.Meys at Ugent.be
> > -------------------------------
> > Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php