[Rd] suggestion: "." in [lsv]apply()

Sokol Serguei sokol sending from insa-toulouse.fr
Mon Apr 20 15:32:23 CEST 2020


On 19/04/2020 at 20:46, Gabor Grothendieck wrote:
> You can get pretty close to that already using fn$ in the gsubfn package:
>
> > library(gsubfn)
> > fn$sapply(split(mtcars, mtcars$cyl), x ~ summary(lm(mpg ~ wt, x))$r.squared)
>         4         6         8
> 0.5086326 0.4645102 0.4229655
Right, I thought about a similar syntax, but this implementation has 
flaws similar to those pointed out by Simon, i.e. it reduces the domain 
of valid inputs (though not for the same parameters). Take an example:

library(gsubfn)
fn$sapply(quote(x+y), as.character)
#Error in lapply(X = X, FUN = FUN, ...) : object 'x' not found

while

sapply(quote(x+y), as.character)
#[1] "+" "x" "y"

This makes me think that it could be advantageous to replace 
match.fun(FUN) in the *apply() family with as.function(FUN), with 
obvious additional methods:
as.function.character <- function(x) match.fun(x)
as.function.name <- function(x) match.fun(x)

Such a replacement would leave current usage of *apply() as is, while 
leaving room for users who want to adapt *apply() to their own objects, 
such as formulas or any other class that match.fun() currently cannot 
convert to a function.
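
A minimal sketch of what such a replacement could enable. It assumes a 
hypothetical wrapper sapply2() standing in for an sapply() that calls 
as.function(FUN) instead of match.fun(FUN), and a hypothetical formula 
method that reads a one-sided formula "~ expr" as function(.) expr. 
Neither name exists in base R, and "..." is added to the methods only to 
match the signature of the as.function() generic:

```r
# Methods suggested above, with `...` added to match as.function(x, ...).
as.function.character <- function(x, ...) match.fun(x)
as.function.name <- function(x, ...) match.fun(x)

# Hypothetical user-supplied method (an assumption, not part of base R):
# turn a one-sided formula `~ expr` into function(.) expr.
as.function.formula <- function(x, ...) {
  stopifnot(length(x) == 2L)           # accept one-sided formulas only
  f <- function(.) NULL
  body(f) <- x[[2L]]                   # right-hand side becomes the body
  environment(f) <- environment(x)     # free variables resolve where the formula was written
  f
}

# Hypothetical stand-in for sapply() with as.function() replacing match.fun().
sapply2 <- function(X, FUN, ...) sapply(X, as.function(FUN), ...)

# Existing usage is unchanged: functions pass through as.function.default().
sapply2(quote(x+y), as.character)
#[1] "+" "x" "y"

# Function names given as strings still work.
sapply2(list(1:3, 1:5), "length")
#[1] 3 5

# New: a formula is converted by the user-supplied method.
sapply2(1:3, ~ . + 1)
#[1] 2 3 4
sapply2(split(mtcars, mtcars$cyl), ~ summary(lm(mpg ~ wt, .))$r.squared)
```

Because only FUN goes through the conversion, X is left untouched, so the 
quote(x+y) input that fails with fn$ is still accepted here.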

Would it be possible?

Best,
Serguei.

> It is not specific to sapply but rather fn$ can preface most 
> functions. If the only free variables are the arguments to the 
> function then you can omit the left hand side of the formula, i.e. the 
> arguments to the function are implied by the free variables in the 
> right hand side. Here x is the implied argument to the function 
> because it is a free variable. We did not have to use the name x. Any 
> name could be used. It is the fact that it is a free variable, not its 
> name, that matters.
>
> > fn$sapply(split(mtcars, mtcars$cyl), ~ sum(dim(x)))
>  4  6  8
> 22 18 25
>
> On Fri, Apr 17, 2020 at 4:11 AM Sokol Serguei
> <sokol using insa-toulouse.fr> wrote:
>> Thanks Simon,
>> Now I see your argument better.
>> On 16/04/2020 at 22:48, Simon Urbanek wrote:
>>> ... I'm not arguing against the principle, I'm arguing about your 
>>> particular proposal as it is inconsistent and not general. 
>> This sounds promising to me. Maybe in the (near?) future, R core will 
>> come up with a proper proposal for this principle? Meanwhile, to avoid 
>> substitute(), I'll look into a variation on formula syntax, as 
>> your example x ~> i + x suggested.
>> Best,
>> Serguei.
>>> Personally, I find the current syntax much clearer and readable 
>>> (defining anything by convention like . being the function variable 
>>> seems arbitrary and "dirty" to me), but if you wanted to define a 
>>> shorter syntax, you could use something like x ~> i + x. That said, 
>>> I really don't see the value of not using function(x) [especially 
>>> these days when people are arguing for long variable names with the 
>>> justification that IDEs do all the work anyway], but as I said, my 
>>> argument was against the actual proposal, not general ideas about 
>>> syntax improvement. Cheers, Simon
>>>> On 17/04/2020, at 3:53 AM, Sokol Serguei <sokol using insa-toulouse.fr> 
>>>> wrote:
>>>> Simon,
>>>> Thanks for replying. In what follows I won't try to argue (I 
>>>> understood that you find this a bad idea), but I would like to 
>>>> clarify some of your points for myself (and maybe for others). 
>>>> On 16/04/2020 at 16:48, Simon Urbanek wrote:
>>>>> Serguei,
>>>>>> On 17/04/2020, at 2:24 AM, Sokol Serguei <sokol using insa-toulouse.fr> 
>>>>>> wrote:
>>>>>> Hi,
>>>>>> I would like to make a suggestion for a small syntactic 
>>>>>> modification of the FUN argument in the [lsv]apply() family of 
>>>>>> functions. The idea is to allow one-liner expressions without 
>>>>>> typing "function(item) {...}" around them. The argument of the 
>>>>>> anonymous function is simply referred to as ".". Let's take an 
>>>>>> example. With this new feature, the following call
>>>>>>
>>>>>> sapply(split(mtcars, mtcars$cyl), function(d) summary(lm(mpg ~ wt, d))$r.squared)
>>>>>> #        4         6         8
>>>>>> #0.5086326 0.4645102 0.4229655
>>>>>>
>>>>>> could be rewritten as
>>>>>>
>>>>>> sapply(split(mtcars, mtcars$cyl), summary(lm(mpg ~ wt, .))$r.squared)
>>>>>>
>>>>>> "Not a big saving in typing", you may say, but multiplied by the 
>>>>>> number of [lsv]apply() uses, and given the neater look, I think 
>>>>>> the idea merits consideration. 
>>>>> It's not in any way "neater", not only is it less readable, it's 
>>>>> just plain wrong. What if the expression returned a function? 
>>>> Do you mean like in
>>>>
>>>> l=sapply(1:3, function(i) function(x) i+x)
>>>> l[[1]](3) # 4
>>>> l[[2]](3) # 5
>>>>
>>>> This is indeed a corner case, but a pair of () or {} can keep 
>>>> wsapply() on course:
>>>>
>>>> l=wsapply(1:3, (function(x) .+x))
>>>> l[[1]](3) # 4
>>>> l[[2]](3) # 5
>>>>> How do you know that you don't want to apply the result of the call? 
>>>> A small example (if it is significantly different from the one 
>>>> above) would be very helpful for me to understand this point.
>>>>> For the same reason the implementation below won't work - very 
>>>>> often you just pass a symbol that evaluates to a function and 
>>>>> always an expression that returns a function and there is no way 
>>>>> to distinguish that from your new proposed syntax. 
>>>> Even with () or {} around such a "dotted" expression?
>>>> Best,
>>>> Serguei.
>>>>> When you feel compelled to use substitute() you should hear alarm 
>>>>> bells that something is wrong ;). You can certainly write a new 
>>>>> function that uses a different syntax (and I'm sure someone has 
>>>>> already done that in the package space), but what you propose is 
>>>>> incompatible with *apply in R (and very much not R syntax). 
>>>>> Cheers, Simon
>>>>>> To illustrate a possible implementation, I propose a wrapper 
>>>>>> example for sapply():
>>>>>>
>>>>>> wsapply=function(l, fun, ...) {
>>>>>>     s=substitute(fun)
>>>>>>     if (is.name(s) || is.call(s) && s[[1]]==as.name("function")) {
>>>>>>         sapply(l, fun, ...) # legacy call
>>>>>>     } else {
>>>>>>         sapply(l, function(d) eval(s, list(.=d)), ...)
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> Now, we can do:
>>>>>>
>>>>>> wsapply(split(mtcars, mtcars$cyl), summary(lm(mpg ~ wt, .))$r.squared)
>>>>>>
>>>>>> or, the traditional way:
>>>>>>
>>>>>> wsapply(split(mtcars, mtcars$cyl), function(d) summary(lm(mpg ~ wt, d))$r.squared)
>>>>>>
>>>>>> Both work. How do you feel about that?
>>>>>> Best,
>>>>>> Serguei.
>>>>>> ______________________________________________ 
>>>>>> R-devel using r-project.org mailing list 
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel 


