[Rd] RFC: (in-principle) native unquoting for standard evaluation

Fri Mar 17 15:46:31 CET 2017

Jonathan,

Nice proposal.

I think these two uses for unary @ (your initial unary @ operator and
Michael's extension for use inside function declarations) synergize really
well. It could easily be that function owners can declare a parameter to
always quote, and function callers can mark their specific arguments to
behave in the way you describe. It would make @ mean two pretty different
things in these two contexts, but they aren't^ mixable, so I think that would
be ok. This also has a strong precedent in the * operator in C, where int
*a declares a pointer, and then *a + 1 uses the dereferenced value.

^ I think they're only not mixable provided that the function() construct
itself does not support your (Jonathan's) version of the operator, i.e.,
the ability to use variables' values to declare parameter names or default
values within the function declaration. (Actually, I think it could be
supported for default values, just not parameter names, if we wanted to.) I
think that's reasonable though. I don't think we would need to support that.
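
For comparison, here is a rough sketch of how the two roles can be
approximated in today's base R. The name quoting_fun is made up for
illustration; substitute() stands in for a formal declared with @, and
bquote() plus eval() stand in for a caller writing @x. This is only an
approximation of the idea, not what the proposed syntax would do
internally.

```r
# Hypothetical stand-in for a formal declared as `@expr`: the function
# quotes its own argument with substitute() instead of evaluating it.
quoting_fun <- function(expr) substitute(expr)

# Ordinary call: the argument arrives as an unevaluated language object.
quoted <- quoting_fun(mpg + 1)

# Caller-side stand-in for `@x`: splice the symbol named by x into the
# call before it is evaluated, using bquote() and eval().
x <- "mpg"
spliced <- eval(bquote(quoting_fun(.(as.name(x)) + 1)))

# Both routes hand the function the same expression.
same_expression <- identical(quoted, spliced)
```

The two mechanisms stay separate here as well: the function decides that
it quotes, and the caller decides what gets spliced in.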

One big question is whether you can do function(x, y, @...). The definition
of mutate() using Michael's extension of your proposal would require this.
This would be in keeping with the principle of the proposal, I think,
though it might (or might not) make the implementation more complicated.
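
As a point of reference, capturing ... unevaluated is already possible in
base R, which is roughly what a formal declared as @... would hand to the
function body. A minimal sketch, where capture_dots is a hypothetical
helper and not part of any package:

```r
# Hypothetical helper: capture everything passed via `...` without
# evaluating it.
capture_dots <- function(x, y, ...) {
  # substitute(list(...)) yields the call list(arg1, arg2, ...); dropping
  # its first element (the symbol `list`) leaves the quoted arguments.
  as.list(substitute(list(...)))[-1]
}

# mpg and cyl do not need to exist here, since nothing is evaluated.
dots <- capture_dots(1, 2, z = mpg * 2, w = cyl)
arg_names <- names(dots)
```

Each element of dots is a language object (a call or a symbol) that the
function can later evaluate in whatever context it chooses.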

I wonder if it makes sense to have a formal ability to declare where the
NSE will take place in the function definition, perhaps, (completely
spitballing) a unary ^ operator, so a simplified subset could literally be
defined as

subset2 = function(^x, @cond) x[cond,]

Perhaps that's getting too clever, but it could be cool. Note it would be
optional. And we might even want a different operator for that,
since it changes what the @ modifier of the parameter does. (Your code gets
the result of the expression being evaluated in the ^ context, rather than
the language object.) This would be, I imagine, immensely useful when
attempting to compile code that is NSE, even beyond labeling it as such via
the @ in function declarations.
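
To make the comparison concrete, this is roughly what the one-liner above
has to spell out by hand in current R. It is a simplified sketch that
assumes cond evaluates to a logical vector without NAs, and it is not a
replacement for base::subset():

```r
# Simplified subset written with today's tools.
subset2 <- function(x, cond) {
  cond_expr <- substitute(cond)               # what `@cond` would deliver
  rows <- eval(cond_expr, x, parent.frame())  # evaluated with x as context,
                                              # the role played by `^x`
  x[rows, ]
}

res <- subset2(mtcars, mpg > 30)
```

The substitute() and eval() lines are exactly the parts that the ^ and @
markers in the declaration would make implicit.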

Best,
~G

On Fri, Mar 17, 2017 at 6:16 AM, Jonathan Carroll <jono at jcarroll.com.au>
wrote:

> I love the pointer analogy. Presumably the additional complication of scope
> breaks this however. * itself would have been a nice operator for this were
> it not prone to ambiguity (`a * *b` vs `a**b`, from which @ does not
> suffer).
>
> Would this extension require that function authors explicitly enable
> auto-quoting support? I somewhat envisioned functions seeing the resolved
> unquoted object (within their calling scope) so that they could retain
> their standard definitions when not using @. In my mutate example, mutate
> itself could simply be the NSE version, so
>
>     mutate(mtcars, z = mpg)
>
> would work as normal, but
>
>     x = "mpg"
>     mutate(mtcars, z = @x)
>
> would produce the same result (x may be changing within a loop or be
> defined through a formal argument). Here, @x would resolve to `mpg` and
> mutate would retain the duty of resolving that to mtcars$mpg as per normal.
>
> A separate SE version would not be required (as arguments could be set
> programmatically), but an additional flexibility could be @ acting on a
> string rather than an object for direct unquoting
>
>     mutate(mtcars, z = @"mpg")
>
> for when the name is known but NSE isn't desired (which would also assist
> with the whole utils::globalVariables() vs CRAN checks concern).
>
> Having a formal argument forcefully auto-unquote would prevent standard
> usage unless there was a way to also disable it. Unless I'm missing an
> angle (which I very likely am) wouldn't it be better to have the user
> supply an @-prefixed argument and retain the connection to the calling
> scope?
>
> Apologies if I have any of that confused or there are better approaches. I
> merely have a desire for this to work and am learning as much as possible
> about "how" as I go.
>
> Your comments are greatly appreciated.
>
> - Jonathan.
>
> On Fri, 17 Mar 2017 at 21:00, Michael Lawrence <lawrence.michael at gene.com>
> wrote:
>
> Interesting idea. Lazy and non-standard evaluation is going to happen; the
> language needs a way to contain it.
>
> I'll extend the proposal so that prefixing a formal argument with @ in
> function() marks the argument as auto-quoting, so it arrives as a language
> object without use of substitute(). Kind of like how '*' in C declares a
> pointer and dereferences one.
>
> subset <- function(x, @subset, ...) { }
>
> This should make it easier to implement such functions, simplify
> compilation, and allow detection of potential quoting errors through static
> analysis.
>
> Michael
>
> On Thu, Mar 16, 2017 at 5:03 PM, Jonathan Carroll <jono at jcarroll.com.au>
> wrote:
>
> (please be gentle, it's my first time)
>
> I am interested in discussions (possibly reiterating past threads --
> searching didn't turn up much) on the possibility of supporting standard
> evaluation unquoting at the language level. This has been brought up in a
> recent similar thread here [1] and on Twitter [2] where I proposed the
> following desired (in-principle) syntax
>
>     f <- function(col1, col2, new_col_name) {
>         mtcars %>% mutate(@new_col_name = @col1 + @col2)
>     }
>
> or closer to home
>
>     x <- 1:10; y <- "x"
>     data.frame(z = @y)
>
> where @ would be defined as a unary prefix operator which substitutes the
> quoted variable name in-place, to allow more flexibility of NSE functions
> within a programming context. This mechanism exists within MySQL [3] (and
> likely other languages) and could potentially be extremely useful. Several
> alternatives have been incorporated into packages (most recently work
> on tidyeval), none of which appear to fully match the simplicity of the
> above, and some of which cut a forceful path through the syntax tree.
>
> The exact syntax isn't my concern at the moment (@ vs unquote() or other,
> though the first requires user-supplied native prefix support within the
> language, as per [1]) and neither is the exact way in which this would be
> achieved (well above my pay grade). The practicality of @ being on the LHS
> of `=` is also of a lesser concern (likely greater complexity) than the
> RHS.
>
> I hear there exists (justified) reluctance to add new syntax to the
> language, but I think this has sufficient merit (and a growing number of
> workarounds) to warrant continued discussion.
>
> With kindest regards,
>
> - Jonathan.
>
> [1] https://stat.ethz.ch/pipermail/r-devel/2017-March/073894.html
> [2] https://twitter.com/carroll_jono/status/842142292253196290
> [3] https://dev.mysql.com/doc/refman/5.7/en/user-variables.html
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Gabriel Becker, PhD
Associate Scientist (Bioinformatics)
Genentech Research
