[Rd] New pipe operator

Duncan Murdoch murdoch@dunc@n @end|ng |rom gm@||@com
Wed Dec 9 17:08:45 CET 2020


On 09/12/2020 10:42 a.m., Jan van der Laan wrote:
> 
> 
> 
> 
> On 09-12-2020 16:20, Duncan Murdoch wrote:
>> On 09/12/2020 9:55 a.m., Jan van der Laan wrote:
>>
>>>
>>> I think only allowing functions on the right hand side (e.g. only the |>
>>> operator and not the |:>) would be enough to handle most cases and seems
>>> easier to reason about. The limitations of that can easily be worked
>>> around using existing functionality in the language.
>>
>> I agree that would be sufficient, but I don't see how it makes reasoning
>> easier.  The transformation is trivial, so I'll assume that doesn't
>> consume any mental energy compared to understanding what the final
>> expression actually does.  Using your currying example, the choice is
>> between
>>
>>    x |> mean(na.rm = TRUE)
>>
>> which transforms to mean(x, na.rm = TRUE), or your proposed
>>
>>    x |> curry(mean, na.rm = TRUE)
>>
>> which transforms to
>>
>>    curry(mean, na.rm = TRUE)(x)
>>
>> To me curry(mean, na.rm = TRUE)(x) looks a lot more complicated than
>> mean(x, na.rm = TRUE), especially since it has the additional risk that
>> users can define their own function called "curry".
> 
> 
> First, I do agree that
> 
> x |> mean(na.rm = TRUE)
> 
> is cleaner, and this covers most users' use cases; many users are also
> used to the syntax from the magrittr pipes.
> 
> However, for programmers (there is no distinct line between users and
> programmers), it is simpler to reason about in the sense that lhs |> rhs
> always means rhs(lhs); this does not depend on whether rhs is a call or
> an (anonymous) function (I am not sure which is called what, which perhaps
> illustrates the difficulty).

I think your proposed rule is pretty simple, with just one case:

lhs |> rhs

would transform to rhs(lhs).  Yes, that's simple.

The current rule is not as simple as yours, but it only has two cases 
instead of 1.  Both involve the rhs being a call, nothing else.

Case 1, the common one:  rhs is a call to a function using regular 
syntax, e.g. f(args) where args might be empty.  Then it is transformed 
to f(lhs, args).

Case 2:  rhs is a call to `function`, which we normally write as 
"function(args) body", which is transformed to (function(args) body)(lhs).

That's it!  Nothing else is allowed.  Not as simple as yours, but simple 
enough to be trivial to reason about.  Most of the effort would be spent 
in figuring out how the transformed expression would evaluate, and since 
your transformed expression is more complicated in the common case where 
currying is needed, I prefer the current proposal.
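
For illustration, the two cases above can be sketched as a small function
that works on unevaluated expressions. This is only a sketch under stated
assumptions: pipe_transform is a hypothetical name, not part of R, and the
real rewrite is done by the parser rather than by a function like this.

```r
# Hypothetical helper (not part of R): mimics the two-case rewrite
# on quoted expressions. lhs and rhs are unevaluated language objects.
pipe_transform <- function(lhs, rhs) {
  if (!is.call(rhs))
    stop("the right-hand side must be a call")
  if (identical(rhs[[1]], as.name("function"))) {
    # Case 2: rhs is a call to `function`, i.e. function(args) body.
    # Wrap it in parentheses and call it on lhs.
    as.call(list(call("(", rhs), lhs))
  } else {
    # Case 1: rhs is f(args). Insert lhs as the first argument.
    as.call(c(list(rhs[[1]], lhs), as.list(rhs)[-1]))
  }
}

# Case 1: builds the call mean(x, na.rm = TRUE)
case1 <- pipe_transform(quote(x), quote(mean(na.rm = TRUE)))

# Case 2: builds a call of the anonymous function on c(1, 3, NA)
case2 <- pipe_transform(quote(c(1, 3, NA)),
                        quote(function(v) mean(v, na.rm = TRUE)))
```

Evaluating case2 with eval() calls the anonymous function on c(1, 3, NA).
A right-hand side that is a bare name, such as quote(mean), is rejected,
which matches "nothing else is allowed".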


> 
> As soon as you start to have functions returning functions, you have to
> think about how many brackets you have to place where. Being able to use
> functions returning functions does open up possibilities for
> programmers, as illustrated for example in my example using expressions.
> This would have been much less clear.

I think your examples would work in the current system, too, with a 
small change to fexpr.  A corresponding change to curry could be made, 
but then it wouldn't be doing currying, so I won't do that.  Here's your 
example rewritten in the R-devel system:

fexpr <- function(x, expr){
   expr <- substitute(expr)
   f <- function(.) {}
   body(f) <- expr
   f(x)
}
. <- fexpr


1:10 |> mean()
c(1,3,NA) |> mean(na.rm = TRUE)
c(1,3,NA) |> .( mean(., na.rm = TRUE) ) |> identity()
c(1,3,NA) |> .( . + 4)
c(1,3,NA) |> fexpr( . + 4)
c(1,3,NA) |> function(x) mean(x, na.rm = TRUE) |> fexpr(. + 1)

That produces the same outputs as your code.
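
As a cross-check that does not depend on pipe support in the parser, the
same lines can be written out by hand in their transformed form. This is a
sketch: the names r1 to r5 are introduced here for illustration only, and
the grouping in the last line assumes the body of `function` extends as
far to the right as possible, so that it absorbs the trailing pipe.

```r
# fexpr as redefined in this message: it takes the piped value x and an
# expression written in terms of `.`, and evaluates the expression on x.
fexpr <- function(x, expr) {
  expr <- substitute(expr)
  f <- function(.) {}
  body(f) <- expr
  f(x)
}
. <- fexpr

# Each call is the hand-written rewrite of the pipe line in the comment.
r1 <- mean(1:10)                         # 1:10 |> mean()
r2 <- mean(c(1, 3, NA), na.rm = TRUE)    # c(1,3,NA) |> mean(na.rm = TRUE)
r3 <- identity(.(c(1, 3, NA), mean(., na.rm = TRUE)))
r4 <- .(c(1, 3, NA), . + 4)              # same as fexpr(c(1, 3, NA), . + 4)
# Assumed grouping: the function body includes the pipe into fexpr.
r5 <- (function(x) fexpr(mean(x, na.rm = TRUE), . + 1))(c(1, 3, NA))
```

Here r1 is 5.5, r2 and r3 are 2, r4 is the vector 5, 7, NA, and r5 is 3.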

Duncan Murdoch


> As for the argument that users are able to redefine curry: yes, they can,
> and this is perhaps a good thing. They can also redefine a lot of other
> stuff. And I am not suggesting that curry or fexpr or . are good names.
> You could even have a curry operator.
> 
> Best,
> Jan
> 
> 
> 
> 
> 
>>
>> Duncan Murdoch
>>
>>>
>>> The problem with only allowing
>>>
>>> x |> mean
>>>
>>> and not
>>>
>>> x |> mean()
>>>
>>> is with additional arguments. However, this can be solved with a
>>> currying function, for example:
>>>
>>> x |> curry(mean, na.rm = TRUE)
>>>
>>> The cost is a few additional characters.
>>>
>>> In the same way it is possible to write a function that accepts an
>>> expression and returns a function containing that expression. This can
>>> be used to have expressions on the right-hand side and reduces the need
>>> for anonymous functions.
>>>
>>> x |> fexpr(. + 10)
>>> dta |> fexpr(lm(y ~ x, data = .))
>>>
>>> You could call this function .:
>>>
>>> x |> .(. + 10)
>>> dta |> .(lm(y ~ x, data = .))
>>>
>>>
>>> Dummy example code (thanks to a colleague of mine):
>>>
>>>
>>> fexpr <- function(expr){
>>>      expr <- substitute(expr)
>>>      f <- function(.) {}
>>>      body(f) <- expr
>>>      f
>>> }
>>> . <- fexpr
>>>
>>> curry <- function(fun,...){
>>>      L <- list(...)
>>>      function(...){
>>>        do.call(fun, c(list(...),L))
>>>      }
>>> }
>>>
>>> `%|>%` <- function(e1, e2) {
>>>      e2(e1)
>>> }
>>>
>>>
>>> 1:10 %|>% mean
>>> c(1,3,NA) %|>% curry(mean, na.rm = TRUE)
>>> c(1,3,NA) %|>% .( mean(., na.rm = TRUE) ) %|>% identity
>>> c(1,3,NA) %|>% .( . + 4)
>>> c(1,3,NA) %|>% fexpr( . + 4)
>>> c(1,3,NA) %|>% function(x) mean(x, na.rm = TRUE) %|>% fexpr(. + 1)
>>>
>>> -- 
>>> Jan
>>>
>>> ______________________________________________
>>> R-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>


