[Rd] New pipe operator

Duncan Murdoch murdoch.duncan at gmail.com
Mon Dec 7 11:41:16 CET 2020


On 06/12/2020 9:23 p.m., Gabriel Becker wrote:
> Hi Gabor,
> 
> On Sun, Dec 6, 2020 at 3:22 PM Gabor Grothendieck <ggrothendieck using gmail.com>
> wrote:
> 
>> I understand very well that it is implemented at the syntax level;
>> however, in any case the implementation is irrelevant to the principles.
>>
>> Here a similar example to the one I gave before but this time written out:
>>
>> This works:
>>
>>    3 |> function(x) x + 1
>>
>> but this does not:
>>
>>    foo <- function(x) x + 1
>>    3 |> foo
>>
>> so it breaks the principle of functions being first class objects.  foo and
>> its definition are not interchangeable.
> 
> 
> I understood what you meant as well.
> 
> The issue is that neither foo nor its definition is being operated on, or
> even exists within the scope of what |> is defined to do. You are used to
> magrittr's %>%, where arguably what you are saying would be true. But it's
> not here, in my view.
> 
> Again, I think the issue is that |>, inasmuch as it "operates" on
> anything at all (it not being a function, regardless of appearances),
> operates on call expression objects, NOT on functions, ever.
> 
> function(x) x *parses to a call expression*, as does RHSfun(), while RHSfun
> does not; it parses to a name, *regardless of whether that symbol will
> eventually evaluate to a closure or not.*
> 
> So in fact, it seems to me that, technically, all name symbols are being
> treated exactly the same (none are allowed, including those which will
> look up to functions during evaluation), while all* call expressions are
> also being treated the same. And again, there are no functions anywhere in
> either case.
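
The parsing claim quoted above can be checked without evaluating anything. This is a minimal sketch using only base R's quote() and class(); RHSfun is the placeholder name from the quoted text and does not need to exist:

```r
# Inspect what the parser produces; nothing here is evaluated.
# `RHSfun` is a placeholder symbol taken from the quoted message.
class(quote(RHSfun))             # "name": a bare symbol
class(quote(RHSfun()))           # "call"
class(quote(function(x) x + 1))  # "call": a call to `function`

# The head of the anonymous-function expression is the symbol `function`.
head_sym <- quote(function(x) x + 1)[[1]]
identical(head_sym, as.name("function"))  # TRUE
```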

I agree it's all about call expressions, but they aren't all being 
treated equally:

x |> f(...)

expands to f(x, ...), while

x |> `function`(...)

expands to `function`(...)(x).  This is an exception to the rule for 
other calls, but I think it's a justified one.
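
For anyone trying this on a released version of R, here is a small sketch of the first expansion. It assumes R 4.1.0 or later, where |> is available; in the released versions the special case for a bare `function` expression on the right-hand side was dropped, so an anonymous function has to be parenthesized and called. The names x, y and f are placeholders and are never evaluated by quote().

```r
# Assumes R >= 4.1.0 (the first release with the native pipe).
# quote() returns the parsed expression, so the parser's rewrite
# is directly visible.
e <- quote(x |> f(y))
identical(e, quote(f(x, y)))        # TRUE: no pipe is left after parsing

# Named function: the call parentheses are required on the RHS.
foo <- function(x) x + 1
r1 <- 3 |> foo()                    # 4

# Anonymous function: parenthesize it and call it.
r2 <- 3 |> (function(x) x + 1)()    # 4
```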

Duncan Murdoch

> 
> * except those that the parser flags as syntactically special.
> 
> 
>> You have
>> to write 3 |> foo() but don't have to write 3 |> (function(x) x + 1)().
>>
> 
> I think you should probably be careful what you wish for here. I'm not
> involved with this work and do not speak for any of those who were, but the
> principled way to make that consistent while remaining entirely in the
> parser seems very likely to be to require the latter, rather than not
> require the former.
> 
> 
>> This isn't just a matter of notation, i.e. foo vs foo(), but is a matter
>> of breaking the way R works as a functional language with first class
>> functions.
>>
> 
> I don't agree. Consider `+`
> 
> Having
> 
> foo <- get("+") ## note no `` here
> foo(x,y)
> 
> parse and work correctly while
> 
> +(x,y)
> 
> does not, does not mean + isn't a function or that it is a "second class
> citizen"; it simply means that the parser has constraints on the syntax for
> writing code that calls it that calls to other functions are not subject to.
> The fact that such *syntactic* constraints can exist proves that there is
> not some overarching inviolable principle being violated here, I think. Now
> you may say "well that's just the parser, it has to parse + specially
> because it's an operator with specific precedence etc". Well, the same exact
> thing is true of |> I think.
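
The `+` comparison in the quoted text can be checked directly in base R. This is a minimal sketch; the values 1 and 2 are arbitrary, and parse(text = ...) is used only so that the syntax error can be caught and inspected:

```r
# `+` is an ordinary function object once you get hold of it.
foo <- get("+")
foo(1, 2)        # 3
`+`(1, 2)        # 3: backticks make the prefix call form parse

# Without backticks the prefix form is rejected by the parser, before
# any evaluation happens: unary + is applied to "(x, y)", and a comma
# is not allowed inside a parenthesized expression.
res <- tryCatch(parse(text = "+(x, y)"),
                error = function(e) "parse error")
res              # "parse error"
```
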
> 
> Best,
> ~G
> 
>>
>> On Sun, Dec 6, 2020 at 4:06 PM Gabriel Becker <gabembecker using gmail.com>
>> wrote:
>>>
>>> Hi Gabor,
>>>
>>> On Sun, Dec 6, 2020 at 12:52 PM Gabor Grothendieck <ggrothendieck using gmail.com> wrote:
>>>>
>>>> I think the real issue here is that functions are supposed to be
>>>> first class objects in R, and |> would break that if it is possible
>>>> to write function(x) x + 1 on the RHS but not foo (assuming foo
>>>> was defined as that function).
>>>>
>>>> I don't think getting experience with using it can change that
>>>> inconsistency which seems serious to me and needs to
>>>> be addressed even if it complicates the implementation
>>>> since it drives to the heart of what R is.
>>>>
>>>
>>> With respect, I think this is a misunderstanding of what is happening here.
>>>
>>> Functions are first class citizens. |> is, for all intents and purposes,
>>> a macro.
>>>
>>> LHS |> RHS(arg2=5)
>>>
>>> parses to
>>>
>>> RHS(LHS, arg2 = 5)
>>>
>>> There are no functions at the point in time when the pipe transformation
>>> happens, because no code has been evaluated. To know if a symbol is going
>>> to evaluate to a function requires evaluation, which is a step entirely
>>> after the one where the |> pipe is implemented.
>>>
>>> Another way to think about it is that
>>>
>>> LHS |> RHS(arg2 = 5)
>>>
>>> is another way of writing RHS(LHS, arg2 = 5), NOT R code that is (or
>>> even can be) evaluated.
>>>
>>>
>>> Now this is a subtle point that only really has implications inasmuch
>>> as it is not the case for magrittr pipes, but it's relevant for discussions
>>> like this, I think.
>>>
>>> ~G
>>>
>>>> On Sat, Dec 5, 2020 at 1:08 PM Gabor Grothendieck
>>>> <ggrothendieck using gmail.com> wrote:
>>>>>
>>>>> The construct utils::head  is not that common but bare functions are
>>>>> very common and to make it harder to use the common case so that
>>>>> the uncommon case is slightly easier is not desirable.
>>>>>
>>>>> Also it is trivial to write this which does work:
>>>>>
>>>>> mtcars %>% (utils::head)
>>>>>
>>>>> On Sat, Dec 5, 2020 at 11:59 AM Hugh Parsonage <hugh.parsonage using gmail.com> wrote:
>>>>>>
>>>>>> I'm surprised by the aversion to
>>>>>>
>>>>>> mtcars |> nrow
>>>>>>
>>>>>> over
>>>>>>
>>>>>> mtcars |> nrow()
>>>>>>
>>>>>> and I think the decision to disallow the former should be
>>>>>> reconsidered.  The pipe operator is only going to be used when the rhs
>>>>>> is a function, so there is no ambiguity with omitting the parentheses.
>>>>>> If it's disallowed, it becomes inconsistent with other treatments like
>>>>>> sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
>>>>>> noise.  I'm not sure why this decision was taken.
>>>>>>
>>>>>> If the only issue is with the double (and triple) colon operator, then
>>>>>> ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
>>>>>> -- in other words, demote the precedence of |>
>>>>>>
>>>>>> Obviously (looking at the R-Syntax branch) this decision was
>>>>>> considered, put into place, then dropped, but I can't see why
>>>>>> precisely.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>>
>>>>>> Hugh.
>>>>>>
>>>>>> On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar <deepayan.sarkar using gmail.com> wrote:
>>>>>>>
>>>>>>> On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>>>>>>>
>>>>>>>> On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
>>>>>>>>>>    Error: function '::' not supported in RHS call of a pipe
>>>>>>>>>
>>>>>>>>> To me, this error looks much more friendly than magrittr's error.
>>>>>>>>> Some users have gotten too used to specifying functions without ().
>>>>>>>>> This is OK until they use `::`, but when they need to use it, it takes
>>>>>>>>> hours to figure out why
>>>>>>>>>
>>>>>>>>> mtcars %>% base::head
>>>>>>>>> #> Error in .::base : unused argument (head)
>>>>>>>>>
>>>>>>>>> won't work but
>>>>>>>>>
>>>>>>>>> mtcars %>% head
>>>>>>>>>
>>>>>>>>> works. I think this is too harsh a way for ordinary R users to
>>>>>>>>> learn that `::` is a function. I've been wanting magrittr to drop
>>>>>>>>> support for a function name without () to avoid this confusion,
>>>>>>>>> so I would very much welcome the new pipe operator's behavior.
>>>>>>>>> Thank you to all the developers who implemented this!
>>>>>>>>
>>>>>>>> I agree, it's an improvement on the corresponding magrittr error.
>>>>>>>>
>>>>>>>> I think the semantics of not evaluating the RHS, but treating the pipe
>>>>>>>> as purely syntactical is a good decision.
>>>>>>>>
>>>>>>>> I'm not sure I like the recommended way to pipe into a particular argument:
>>>>>>>>
>>>>>>>>     mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)
>>>>>>>>
>>>>>>>> or
>>>>>>>>
>>>>>>>>     mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)
>>>>>>>>
>>>>>>>> both of which are equivalent to
>>>>>>>>
>>>>>>>>     mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()
>>>>>>>>
>>>>>>>> It's tempting to suggest it should allow something like
>>>>>>>>
>>>>>>>>     mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)
>>>>>>>
>>>>>>> Which is really not that far off from
>>>>>>>
>>>>>>> mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)
>>>>>>>
>>>>>>> once you get used to it.
>>>>>>>
>>>>>>> One consequence of the implementation is that it's not clear how
>>>>>>> multiple occurrences of the placeholder would be interpreted. With
>>>>>>> magrittr,
>>>>>>>
>>>>>>> sort(runif(10)) %>% ecdf(.)(.)
>>>>>>> ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
>>>>>>>
>>>>>>> This is probably what you would expect, if you expect it to work at
>>>>>>> all, and not
>>>>>>>
>>>>>>> ecdf(sort(runif(10)))(sort(runif(10)))
>>>>>>>
>>>>>>> There would be no such ambiguity with anonymous functions
>>>>>>>
>>>>>>> sort(runif(10)) |> \(.) ecdf(.)(.)
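
On a released R the same idea can be tried as below. This is a sketch assuming R 4.1.0 or later, where the anonymous function must be parenthesized and called; the seed is arbitrary and only there for reproducibility:

```r
# Assumes R >= 4.1.0. The left-hand side is evaluated once and bound to
# `.` inside the anonymous function, so both uses see the same sample.
set.seed(1)  # arbitrary seed
res <- sort(runif(10)) |> (\(.) ecdf(.)(.))()

# The empirical CDF of 10 distinct sorted values, evaluated at those
# same values, is 0.1, 0.2, ..., 1.0.
all.equal(res, (1:10) / 10)   # TRUE
```
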
>>>>>>>
>>>>>>> -Deepayan
>>>>>>>
>>>>>>>> which would be expanded to something equivalent to the other versions:
>>>>>>>> but that makes it quite a bit more complicated.  (Maybe _ or \. should
>>>>>>>> be used instead of ., since those are not legal variable names.)
>>>>>>>>
>>>>>>>> I don't think there should be an attempt to copy magrittr's special
>>>>>>>> casing of how . is used in determining whether to also include the
>>>>>>>> previous value as first argument.
>>>>>>>>
>>>>>>>> Duncan Murdoch
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Hiroaki Yutani
>>>>>>>>>
>>>>>>>>> On Fri, 4 Dec 2020 at 20:51, Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Just saw this on the R-devel news:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> R now provides a simple native pipe syntax ‘|>’ as well as a shorthand
>>>>>>>>>> notation for creating functions, e.g. ‘\(x) x + 1’ is parsed as
>>>>>>>>>> ‘function(x) x + 1’. The pipe implementation as a syntax transformation
>>>>>>>>>> was motivated by suggestions from Jim Hester and Lionel Henry. These
>>>>>>>>>> features are experimental and may change prior to release.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This is a good addition; by using "|>" instead of "%>%" there should be
>>>>>>>>>> a chance to get operator precedence right.  That said, the ?Syntax help
>>>>>>>>>> topic hasn't been updated, so I'm not sure where it fits in.
>>>>>>>>>>
>>>>>>>>>> There are some choices that take a little getting used to:
>>>>>>>>>>
>>>>>>>>>>    > mtcars |> head
>>>>>>>>>> Error: The pipe operator requires a function call or an anonymous
>>>>>>>>>> function expression as RHS
>>>>>>>>>>
>>>>>>>>>> (I need to say mtcars |> head() instead.)  This sometimes leads to error
>>>>>>>>>> messages that are somewhat confusing:
>>>>>>>>>>
>>>>>>>>>>    > mtcars |> magrittr::debug_pipe |> head
>>>>>>>>>> Error: function '::' not supported in RHS call of a pipe
>>>>>>>>>>
>>>>>>>>>> but
>>>>>>>>>>
>>>>>>>>>> mtcars |> magrittr::debug_pipe() |> head()
>>>>>>>>>>
>>>>>>>>>> works.
>>>>>>>>>>
>>>>>>>>>> Overall, I think this is a great addition, though it's going to be
>>>>>>>>>> disruptive for a while.
>>>>>>>>>>
>>>>>>>>>> Duncan Murdoch
>>>>>>>>>>
>>>>>>>>>> ______________________________________________
>>>>>>>>>> R-devel using r-project.org mailing list
>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Statistics & Software Consulting
>>>>> GKX Group, GKX Associates Inc.
>>>>> tel: 1-877-GKX-GROUP
>>>>> email: ggrothendieck at gmail.com


