[Rd] Multiple Assignment built into the R Interpreter?

Sun Mar 12 11:06:33 CET 2023

I really like it!  Nicely done.

Duncan Murdoch

On 11/03/2023 6:00 p.m., Kevin Ushey wrote:
> FWIW, it's possible to get fairly close to your proposed semantics
> using the existing metaprogramming facilities in R. I put together a
> prototype package here to demonstrate:
> 
>      https://github.com/kevinushey/dotty
> 
> The package exports an object called `.`, with a special `[<-.dot` S3
> method which enables destructuring assignments. This means you can
> write code like:
> 
>      .[nr, nc] <- dim(mtcars)
> 
> and that will define 'nr' and 'nc' as you expect.
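
For readers unfamiliar with the trick, here is a minimal sketch of how a `[<-` S3 method can perform destructuring. It is not dotty's actual implementation: the object name `dot` and the purely positional matching are assumptions made for illustration. It relies on R handing the index arguments to the method as unevaluated promises, so their names can be recovered with substitute().

```r
# Hypothetical stand-in for the exported `.` object (name chosen for illustration).
dot <- structure(list(), class = "dot")

`[<-.dot` <- function(x, ..., value) {
  # Recover the unevaluated index expressions, e.g. the symbols nr and nc.
  targets <- eval(substitute(alist(...)))
  # The frame in which `dot[...] <- value` was written.
  env <- parent.frame()
  for (i in seq_along(targets)) {
    # Assign the i-th element of the right-hand side to the i-th name.
    assign(as.character(targets[[i]]), value[[i]], envir = env)
  }
  # Return the object unchanged so that `dot` itself is not clobbered.
  invisible(x)
}

dot[nr, nc] <- dim(mtcars)
c(nr, nc)
```

The method returns `x` because R rebinds `dot` to whatever the replacement function returns.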
> 
> As for R CMD check warnings, you can suppress those through the use of
> globalVariables(), and that can also be automated within the package.
> The 'dotty' package includes a function 'dotify()' which automates
> looking for such usages in your package, and calling globalVariables()
> so that R CMD check doesn't warn. In theory, a similar technique would
> be applicable to other packages defining similar operators (zeallot,
> collapse).
> 
> Obviously, globalVariables() is a very heavy hammer to swing for this
> issue, but you might consider the benefits worth the tradeoffs.
> 
> Best,
> Kevin
> 
> On Sat, Mar 11, 2023 at 2:53 PM Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>
>> On 11/03/2023 4:42 p.m., Sebastian Martin Krantz wrote:
>>> Thanks Duncan and Ivan for the careful thoughts. I'm not sure I can
>>> follow all aspects you raised, but to give my limited take on a few:
>>>
>>>> your proposal violates a very basic property of the language, i.e.
>>>> that all statements are expressions and have a value. What's the
>>>> value of 1 + (A, C = init_matrices()).
>>>
>>> I'm not sure I see the point here. I evaluated 1 + (d = dim(mtcars); nr
>>> = d[1]; nc = d[2]; rm(d)), which simply gives a syntax error,
>>
>>
>>     d = dim(mtcars); nr = d[1]; nc = d[2]; rm(d)
>>
>> is not a statement, it is a sequence of 4 statements.
>>
>> Duncan Murdoch
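
The distinction can be checked from R itself. The sketch below parses the line to count its top-level expressions, then wraps the same statements in braces, which does form a single expression whose value is that of the last statement. The variable names `src`, `exprs` and `value` are introduced here for illustration only.

```r
src <- "d = dim(mtcars); nr = d[1]; nc = d[2]; rm(d)"

# parse() returns one entry per top-level expression in the text.
exprs <- parse(text = src)
length(exprs)

# Braces turn the sequence into one expression. Its value is the value of
# rm(d), which is NULL, and 1 + NULL is a zero-length numeric vector.
value <- 1 + {d = dim(mtcars); nr = d[1]; nc = d[2]; rm(d)}
value
```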
>>
>>    as the
>>> above expression should. `%=%` assigns to
>>> environments, so 1 + (c("A", "C") %=% init_matrices()) returns
>>> numeric(0), with A and C having their values assigned.
>>>
>>>> suppose f() returns list(A = 1, B = 2) and I do
>>>>
>>>>     B, A <- f()
>>>>
>>>> Should assignment be by position or by name?
>>>
>>> In other languages this is by position. The feature is not meant to
>>> replace list2env(), and being able to rename objects in the assignment
>>> is a vital feature of code using multi-input and multi-output
>>> functions, e.g. in Matlab or Julia.
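
To make the two readings concrete, the sketch below contrasts assignment by name, which list2env() already provides, with assignment by position. The helper `assign_by_position` is hypothetical and written only for this illustration; it is not part of base R or collapse.

```r
f <- function() list(A = 1, B = 2)

# By name: list2env() creates variables called A and B, whatever the caller wants.
by_name <- list2env(f(), envir = new.env())

# By position: a hypothetical helper that pairs the i-th name with the i-th value.
assign_by_position <- function(names, values, envir = parent.frame()) {
  stopifnot(length(names) <= length(values))
  for (i in seq_along(names)) {
    assign(names[i], values[[i]], envir = envir)
  }
  invisible(NULL)
}

# Positional reading of `B, A <- f()`: B receives the first element, A the second.
by_pos <- new.env()
assign_by_position(c("B", "A"), f(), envir = by_pos)

c(by_name$A, by_name$B)
c(by_pos$A, by_pos$B)
```

Under the positional reading the element names of the returned list play no role, which is what makes renaming possible.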
>>>
>>>> Honestly, given that this is simply syntactic sugar, I don't think I would support it.
>>>
>>> You can call it that, but it would be used by almost every R user almost
>>> every day. Simple things like nr, nc = dim(x); values, vectors =
>>> eigen(x) etc. where the creation of intermediate objects
>>> is cumbersome and redundant.
>>>
>>>> I see you've already mentioned it ("JavaScript-like"). I think it
>>>> would fulfil Sebastian's requirements too, as long as it is
>>>> considered "true assignment" by the rest of the language.
>>>
>>> I don't have strong opinions about how the issue is phrased or
>>> implemented. Something like [t, n] = dim(x) might even be more clear.
>>> It's important though that assignment remains by position,
>>> so even if some output gets thrown away that should also be positional.
>>>
>>>>     A <- 0
>>>>     [A, B = A + 10] <- list(1, A = 2)
>>>
>>> I also fail to see the use of allowing this. Something like this is an
>>> error:
>>>
>>>> A = 2
>>>> (B = A + 1) <- 1
>>> Error in (B = A + 1) <- 1 : could not find function "(<-"
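
Some background on why the message mentions a function named "(<-": R evaluates an assignment whose left-hand side is a call by looking up a replacement function with `<-` appended to the function name. The sketch below shows the equivalence for names() and then a simpler parenthesised target than the one above. The variable names are illustrative, and the exact error text may differ between R versions.

```r
x <- c(1, 2)
names(x) <- c("a", "b")                   # assignment to a call ...

y <- c(1, 2)
y <- `names<-`(y, value = c("a", "b"))    # ... is shorthand for this call
identical(x, y)

# Parentheses are themselves a function, `(`, so `(B) <- 1` makes R look for
# a replacement function named `(<-`, and no such function exists.
B <- 0
msg <- tryCatch(
  eval(parse(text = "(B) <- 1")),
  error = function(e) conditionMessage(e)
)
msg
```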
>>>
>>> Regarding the practical implementation, I think `collapse::%=%` is a
>>> good starting point. It could be introduced in R as a separate function,
>>> or `=` could be modified to accommodate its capability. It should be
>>> clear that with more than one LHS variable the assignment is an
>>> environment-level operation, and the results can only be used in
>>> computations once assigned to the environment, e.g. in
>>> 1 + (c("A", "C") %=% init_matrices()), A and C are not available for
>>> the addition in this statement. The interpreter then needs to be
>>> modified to read something like nr, nc = dim(x) or [nr, nc] = dim(x)
>>> as an environment-level multiple assignment operation with no
>>> immediate value. This appears very feasible to my limited understanding,
>>> but I guess there are other things to consider still. I definitely
>>> appreciate the responses so far though.
>>>
>>> Best regards,
>>>
>>> Sebastian
>>>
>>>
>>>
>>>
>>>
>>> On Sat, 11 Mar 2023 at 20:38, Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>>
>>>      On 11/03/2023 11:57 a.m., Ivan Krylov wrote:
>>>       > On Sat, 11 Mar 2023 11:11:06 -0500
>>>       > Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>>       >
>>>       >> That's clear, but your proposal violates a very basic property
>>>       >> of the language, i.e. that all statements are expressions and
>>>       >> have a value.
>>>       >
>>>       > How about reframing this feature request from multiple assignment
>>>       > (which does go contrary to "everything has only one value, even
>>>       > if it's sometimes invisible(NULL)") to "structured binding" /
>>>       > "destructuring assignment" [*], which takes this single value
>>>       > returned by the expression and subsets it subject to certain
>>>       > rules? It may be easier to make a decision on the semantics for
>>>       > destructuring assignment (e.g. languages which have this feature
>>>       > typically allow throwing unneeded parts of the return value
>>>       > away), and it doesn't seem to break as much of the rest of the
>>>       > language if implemented.
>>>       >
>>>       > I see you've already mentioned it ("JavaScript-like"). I think it
>>>       > would fulfil Sebastian's requirements too, as long as it is
>>>       > considered "true assignment" by the rest of the language.
>>>       >
>>>       > The hard part is to propose the actual grammar of the new feature
>>>       > (in terms of src/main/gram.y, preferably without introducing
>>>       > conflicts) and its semantics (including the corner cases, some of
>>>       > which you have already mentioned). I'm not sure I'm up to the task.
>>>       >
>>>
>>>      If I were doing it, here's what I'd propose:
>>>
>>>          '[' formlist ']' LEFT_ASSIGN expr
>>>          '[' formlist ']' EQ_ASSIGN expr
>>>          expr RIGHT_ASSIGN  '[' formlist ']'
>>>
>>>      where `formlist` has the syntax of the formals list for a function
>>>      definition.  This would have the following semantics:
>>>
>>>           {
>>>             *tmp* <- expr
>>>
>>>             # For arguments with no "default" expression,
>>>
>>>             argname1 <- *tmp*[[1]]
>>>             argname2 <- *tmp*[[2]]
>>>             ...
>>>
>>>             # For arguments with a default listed
>>>
>>>             argname3 <- with(*tmp*, default3)
>>>           }
>>>
>>>
>>>      The value of the whole thing would therefore be (invisibly) the
>>>      value of the last item in the assignment.
>>>
>>>      Two examples:
>>>
>>>          [A, B, C] <- expr   # assign the first three elements of expr
>>>                              # to A, B, and C
>>>
>>>          [A, B, C = a + b] <- expr  # assign the first two elements of expr
>>>                                     # to A and B,
>>>                                     # assign with(expr, a + b) to C.
>>>
>>>      Unfortunately, I don't think this could be done entirely by
>>>      transforming the expression (which is the way |> was done), and
>>>      that makes it a lot harder to write and to reason about.  E.g.
>>>      what does this do?
>>>
>>>          A <- 0
>>>          [A, B = A + 10] <- list(1, A = 2)
>>>
>>>      According to the recipe above, I think it sets A to 1 and B to 12, but
>>>      maybe a user would expect B to be 10 or 11.  And according to that
>>>      recipe this is an error:
>>>
>>>          [A, B = A + 10] <- c(1, A = 2)
>>>
>>>      which probably isn't what a user would expect, given that this is fine:
>>>
>>>          [A, B] <- c(1, 2)
>>>
>>>      Duncan Murdoch
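
The recipe and its corner cases can be tried out today with an ordinary function standing in for the proposed syntax. The helper `destructure` below is hypothetical and exists only for this illustration. It takes the bracketed list as an alist(), evaluates "default" expressions with eval() on the value (which is what with() does for a list), and assumes that targets without a default are matched to elements in the order they appear.

```r
# Hypothetical emulation of the recipe; not an existing R function.
destructure <- function(targets, value, env = parent.frame()) {
  nms <- names(targets)
  if (is.null(nms)) nms <- rep("", length(targets))
  pos <- 0L      # counts the targets that have no "default" expression
  last <- NULL
  for (i in seq_along(targets)) {
    if (nzchar(nms[i])) {
      # Target with a default: evaluate it using the value as data.
      last <- eval(targets[[i]], value, enclos = env)
      assign(nms[i], last, envir = env)
    } else {
      # Plain target: take the next element by position.
      pos <- pos + 1L
      last <- value[[pos]]
      assign(as.character(targets[[i]]), last, envir = env)
    }
  }
  invisible(last)
}

A <- 0
destructure(alist(A, B = A + 10), list(1, A = 2))
# A is 1 and B is 12: the list element named A shadows the freshly assigned A.
first <- c(A = A, B = B)

# An atomic vector cannot serve as data for evaluating the default, so this fails.
outcome <- tryCatch(
  destructure(alist(A, B = A + 10), c(1, A = 2)),
  error = function(e) "error"
)

# Purely positional targets work with an atomic vector.
destructure(alist(A, B), c(1, 2))
```

This reproduces the outcomes described above: B becomes 12 in the list case, and the atomic-vector case with a default is an error even though the purely positional one is fine.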
>>>
>>
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel