[Rd] Multiple Assignment built into the R Interpreter?
Sebastian Martin Krantz
@eb@@t|@n@kr@ntz @end|ng |rom gr@du@te|n@t|tute@ch
Sun Mar 12 08:42:51 CET 2023
Thanks Gabriel and Kevin for your inputs,
regarding your points Gabriel, I think Python and Julia do allow multiple
sub-assignment, but in-line with my earlier suggestion in response to
Duncan to make multiple assignment an environment-level operation (like
collapse::%=% currently works), this would not be possible in R.
Regarding the [a] <- coolest_function() syntax, yeah it would mean do
multiple assignment and set a equal to the first element dropping all other
elements. Multiple assignment should be positional loke in other languages,
enabling flexible renaming of objects on the fly. So it should be
irrelevant whether the function returns a named or unnamed list or vector.
Thanks also Kevin for this contribution. I think it’s a remarkable effort,
and I wouldn’t mind such semantics e.g. making it a function call to ‘.[‘
or any other one-letter function, as long as it’s coded in C and recognized
by the interpreter as an assignment operation.
Best regards,
Sebastian
On Sun 12. Mar 2023 at 01:00, Kevin Ushey <kevinushey using gmail.com> wrote:
> FWIW, it's possible to get fairly close to your proposed semantics
> using the existing metaprogramming facilities in R. I put together a
> prototype package here to demonstrate:
>
> https://github.com/kevinushey/dotty
>
> The package exports an object called `.`, with a special `[<-.dot` S3
> method which enables destructuring assignments. This means you can
> write code like:
>
> .[nr, nc] <- dim(mtcars)
>
> and that will define 'nr' and 'nc' as you expect.
>
> As for R CMD check warnings, you can suppress those through the use of
> globalVariables(), and that can also be automated within the package.
> The 'dotty' package includes a function 'dotify()' which automates
> looking for such usages in your package, and calling globalVariables()
> so that R CMD check doesn't warn. In theory, a similar technique would
> be applicable to other packages defining similar operators (zeallot,
> collapse).
>
> Obviously, globalVariables() is a very heavy hammer to swing for this
> issue, but you might consider the benefits worth the tradeoffs.
>
> Best,
> Kevin
>
> On Sat, Mar 11, 2023 at 2:53 PM Duncan Murdoch <murdoch.duncan using gmail.com>
> wrote:
> >
> > On 11/03/2023 4:42 p.m., Sebastian Martin Krantz wrote:
> > > Thanks Duncan and Ivan for the careful thoughts. I'm not sure I can
> > > follow all aspects you raised, but to give my limited take on a few:
> > >
> > >> your proposal violates a very basic property of the language, i.e.
> that all statements are expressions and have a value. > What's the value
> of 1 + (A, C = init_matrices()).
> > >
> > > I'm not sure I see the point here. I evaluated 1 + (d = dim(mtcars); nr
> > > = d[1]; nc = d[2]; rm(d)), which simply gives a syntax error,
> >
> >
> > d = dim(mtcars); nr = d[1]; nc = d[2]; rm(d)
> >
> > is not a statement, it is a sequence of 4 statements.
> >
> > Duncan Murdoch
> >
> > as the
> > > above expression should. `%=%` assigns to
> > > environments, so 1 + (c("A", "C") %=% init_matrices()) returns
> > > numeric(0), with A and C having their values assigned.
> > >
> > >> suppose f() returns list(A = 1, B = 2) and I do > B, A <- f() >
> Should assignment be by position or by name?
> > >
> > > In other languages this is by position. The feature is not meant to
> > > replace list2env(), and being able to rename objects in the assignment
> > > is a vital feature of codes
> > > using multi input and output functions e.g. in Matlab or Julia.
> > >
> > >> Honestly, given that this is simply syntactic sugar, I don't think I
> would support it.
> > >
> > > You can call it that, but it would be used by almost every R user
> almost
> > > every day. Simple things like nr, nc = dim(x); values, vectors =
> > > eigen(x) etc. where the creation of intermediate objects
> > > is cumbersome and redundant.
> > >
> > >> I see you've already mentioned it ("JavaScript-like"). I think it
> would fulfil Sebastian's requirements too, as long as it is considered
> "true assignment" by the rest of the language.
> > >
> > > I don't have strong opinions about how the issue is phrased or
> > > implemented. Something like [t, n] = dim(x) might even be more clear.
> > > It's important though that assignment remains by position,
> > > so even if some output gets thrown away that should also be positional.
> > >
> > >> A <- 0 > [A, B = A + 10] <- list(1, A = 2)
> > >
> > > I also fail to see the use of allowing this. something like this is an
> > > error.
> > >
> > >> A = 2
> > >> (B = A + 1) <- 1
> > > Error in (B = A + 1) <- 1 : could not find function "(<-"
> > >
> > > Regarding the practical implementation, I think `collapse::%=%` is a
> > > good starting point. It could be introduced in R as a separate
> function,
> > > or `=` could be modified to accommodate its capability. It should be
> > > clear that
> > > with more than one LHS variables the assignment is an environment level
> > > operation and the results can only be used in computations once
> assigned
> > > to the environment, e.g. as in 1 + (c("A", "C") %=% init_matrices()),
> > > A and C are not available for the addition in this statement. The
> > > interpretor then needs to be modified to read something like nr, nc =
> > > dim(x) or [nr, nc] = dim(x). as an environment-level multiple
> assignment
> > > operation with no
> > > immediate value. Appears very feasible to my limited understanding, but
> > > I guess there are other things to consider still. Definitely appreciate
> > > the responses so far though.
> > >
> > > Best regards,
> > >
> > > Sebastian
> > >
> > >
> > >
> > >
> > >
> > > On Sat, 11 Mar 2023 at 20:38, Duncan Murdoch <murdoch.duncan using gmail.com
> > > <mailto:murdoch.duncan using gmail.com>> wrote:
> > >
> > > On 11/03/2023 11:57 a.m., Ivan Krylov wrote:
> > > > On Sat, 11 Mar 2023 11:11:06 -0500
> > > > Duncan Murdoch <murdoch.duncan using gmail.com
> > > <mailto:murdoch.duncan using gmail.com>> wrote:
> > > >
> > > >> That's clear, but your proposal violates a very basic property
> > > of the
> > > >> language, i.e. that all statements are expressions and have a
> value.
> > > >
> > > > How about reframing this feature request from multiple
> assignment
> > > > (which does go contrary to "everything has only one value, even
> > > if it's
> > > > sometimes invisible(NULL)") to "structured binding" /
> "destructuring
> > > > assignment" [*], which takes this single single value returned
> by the
> > > > expression and subsets it subject to certain rules? It may be
> > > easier to
> > > > make a decision on the semantics for destructuring assignment
> (e.g.
> > > > languages which have this feature typically allow throwing
> unneeded
> > > > parts of the return value away), and it doesn't seem to break
> as much
> > > > of the rest of the language if implemented.
> > > >
> > > > I see you've already mentioned it ("JavaScript-like"). I think
> it
> > > would
> > > > fulfil Sebastian's requirements too, as long as it is considered
> > > "true
> > > > assignment" by the rest of the language.
> > > >
> > > > The hard part is to propose the actual grammar of the new
> feature (in
> > > > terms of src/main/gram.y, preferably without introducing
> > > conflicts) and
> > > > its semantics (including the corner cases, some of which you
> have
> > > > already mentioned). I'm not sure I'm up to the task.
> > > >
> > >
> > > If I were doing it, here's what I'd propose:
> > >
> > > '[' formlist ']' LEFT_ASSIGN expr
> > > '[' formlist ']' EQ_ASSIGN expr
> > > expr RIGHT_ASSIGN '[' formlist ']'
> > >
> > > where `formlist` has the syntax of the formals list for a function
> > > definition. This would have the following semantics:
> > >
> > > {
> > > *tmp* <- expr
> > >
> > > # For arguments with no "default" expression,
> > >
> > > argname1 <- *tmp*[[1]]
> > > argname2 <- *tmp*[[2]]
> > > ...
> > >
> > > # For arguments with a default listed
> > >
> > > argname3 <- with(*tmp*, default3)
> > > }
> > >
> > >
> > > The value of the whole thing would therefore be (invisibly) the
> > > value of
> > > the last item in the assignment.
> > >
> > > Two examples:
> > >
> > > [A, B, C] <- expr # assign the first three elements of expr
> to A,
> > > B, and C
> > >
> > > [A, B, C = a + b] <- expr # assign the first two elements of
> expr
> > > # to A and B,
> > > # assign with(expr, a + b) to C.
> > >
> > > Unfortunately, I don't think this could be done entirely by
> > > transforming
> > > the expression (which is the way |> was done), and that makes it a
> lot
> > > harder to write and to reason about. E.g. what does this do?
> > >
> > > A <- 0
> > > [A, B = A + 10] <- list(1, A = 2)
> > >
> > > According to the recipe above, I think it sets A to 1 and B to 12,
> but
> > > maybe a user would expect B to be 10 or 11. And according to that
> > > recipe this is an error:
> > >
> > > [A, B = A + 10] <- c(1, A = 2)
> > >
> > > which probably isn't what a user would expect, given that this is
> fine:
> > >
> > > [A, B] <- c(1, 2)
> > >
> > > Duncan Murdoch
> > >
> >
> > ______________________________________________
> > R-devel using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
[[alternative HTML version deleted]]
More information about the R-devel
mailing list