[Rd] Multiple Assignment built into the R Interpreter?

@vi@e@gross m@iii@g oii gm@ii@com
Sun Mar 12 02:09:20 CET 2023


I am not personally for or against changes to the R main language but do find that too many people keep wanting to change R so it should be like some other language. Many features would be nice, especially if they do not break existing code, but the time and effort and other overheads need to be a consideration.

R has long had a concept of returning a data structure with inner parts such as from calling lm() or ggplot() that can be used to store lots of info, often including saved copies of the data and parameters that generated it, and you can often update the object or query it for specific fields or pass it intact to some next step.

Anyone wanting to return multiple results has classically returned a list of named items, and the calling code had the option of unpacking the parts it wanted and ignoring the others. This may not be modern or elegant, but so what?
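
As a rough sketch of that classic idiom (the function and field names below are made up purely for illustration):

```r
# Hypothetical helper that returns several results in one named list.
summarize_vec <- function(x) {
  list(mean = mean(x), sd = sd(x), n = length(x))
}

res <- summarize_vec(c(2, 4, 6))

# Take only the parts you want and ignore the rest.
m <- res$mean
n <- res[["n"]]

# Or evaluate an expression against several fields at once with with().
se <- with(res, sd / sqrt(n))
```

Nothing here needs new syntax; the cost is the intermediate name res.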

I could go through a long list of nice things I see in other languages and ask if R has that ability now (perhaps in a package) or should have it.

Do we need a swap like:

  a, b <- b, a

Do we need a dual comparative like:

  if (-5 < X < 5) ...

The Python equivalents of the two examples (a, b = b, a and -5 < X < 5) work fine, as do many other things, but Python is not R and cannot trivially do many things R does either.
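
For comparison, a minimal sketch of how both are usually spelled in R today (the variable names are arbitrary):

```r
a <- 1
b <- 2

# Swap with a temporary variable.
tmp <- a
a <- b
b <- tmp

# Range check: write out both comparisons and combine them.
X <- 3
in_range <- -5 < X && X < 5
```

It is a few more keystrokes, but it is unambiguous and works in every version of R.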

Many programming languages have been converging in some ways. Scala 3 simplified many parts of the language by borrowing its own version of indentation-based syntax from languages like Haskell and Python. Personally, I like it, but it causes some headaches for older code and you have to either change the code or disable the feature. So, does anyone think R should follow suit and allow indentation level in the many places that now use grouping by curly braces or other methods?

I am NOT saying any additions are impossible, but we need to keep the base language working well on existing code or perhaps make a clean break and make a new version of R (I assume logically Q: S went to S-- (AKA R), so R-- would be Q, LOL!).

There is a community of users of a language who are partially here based on existing R packages. Some can probably find more and more functionality along the same lines as Python modules and elsewhere with some work and continue in a language that may fit their personal preferences. But the people responsible for maintaining and developing R are not just casual users and would have to do serious amounts of work choosing what it would look like and how to implement it, and deal with edge cases and complaints.

The new R pipe is a case in point. Why was it added? I mean I have been using various pipes, including in the tidyverse, for years quite happily. I did not need it added except perhaps for performance reasons. But when it came out, it had a rather glaring limitation: it provided no fairly trivial way to pass the piped value to anything other than the first positional argument. Sure, they added a kludge using a horrible-to-read anonymous function syntax that I suspect many will not use, preferring to find their own ways around it.
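
For reference, a minimal sketch of what this looks like in practice, using the built-in mtcars data. The lm() call is just an arbitrary example of a function whose data argument is not first; the anonymous function form needs R 4.1 or later, and the `_` placeholder needs R 4.2 or later.

```r
# The native pipe feeds the left-hand side into the first argument.
n_rows <- mtcars |> nrow()

# Non-first argument via an anonymous function (R >= 4.1).
fit1 <- mtcars |> (\(d) lm(mpg ~ wt, data = d))()

# Non-first argument via the `_` placeholder, which must be passed
# as a named argument (R >= 4.2).
fit2 <- mtcars |> lm(mpg ~ wt, data = _)
```

Both forms should fit the same model; which one reads better is a matter of taste.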

But am I glad it was added? Sure. Could they have improved something else instead that was missing? I mean I use the glue package but would love something like an f-string built in that did pretty much stuff like that as can be found in other languages. 

This may sound stupid, but if someone wants features from another language, maybe they should consider using the two languages together and alternating which one diddles with your data as can be done multiple ways already.

Will that solve what the OP wants? Nope. They want whatever tool they are using to become a Swiss Army Knife.

But there is something to be said for a sparse language that does a few things well and does not grow just to grow and be like everyone else.






-----Original Message-----
From: R-devel <r-devel-bounces using r-project.org> On Behalf Of Gabriel Becker
Sent: Saturday, March 11, 2023 5:54 PM
To: Sebastian Martin Krantz <sebastian.krantz using graduateinstitute.ch>
Cc: r-devel <r-devel using r-project.org>
Subject: Re: [Rd] Multiple Assignment built into the R Interpreter?

There are some other considerations too (apologies if these were mentioned
above and I missed them). Also below are initial thoughts, so apologies for
any mistakes or oversights.

For example, if

[a, b] <- my2valuefun()

works the same as

local({
tmp <- my2valuefun()
stopifnot(is.list(tmp) && length(tmp) == 2)
a <<- tmp[[1]]
b <<- tmp[[2]]
})

Do we expect

[a[1], b[3]] <- my2valuefun()

to also work? That doesn't sound very fun to me, personally, but obviously
the "single value return" versions of these do work and have for a long
time, i.e.

a[1] <- my2valuefun()[[1]]
b[3] <- my2valuefun()[[2]]

is perfectly valid R code (though it does call the function twice which is
"silly" in some sense).

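Avoiding the second call is straightforward with a temporary; a minimal sketch, where my2valuefun() is a stand-in defined here only so the example runs:

```r
# Stand-in for the hypothetical two-value function above.
my2valuefun <- function() list(10, 20)

a <- numeric(1)
b <- numeric(3)

tmp <- my2valuefun()   # call once, reuse the result
a[1] <- tmp[[1]]
b[3] <- tmp[[2]]
```

This is essentially what the proposed syntax would have to expand to for subset-assignment targets.
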
Another thing which arises from the Julia API specifically which I think is
problematic is the ambiguity of R's atomic "types" being vectors. Consider
the following

coolest_function <- function() c(a = 15, b = 65, c = 275)
a <- coolest_function()

That obviously makes a vector of length 3. Anything else would break *like
all the R code*

But now, what does

[a] <- coolest_function()

do? Does it assign 15 to a, because b and c aren't being assigned to?

Does this mean variables being assigned to actually need to *match the
names within the return object*? I don't think that would work at all in
general...
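
To make the candidate readings concrete, here is a small sketch in current R of what each interpretation would pick out (the variable names by_position, by_name and whole are just labels for this illustration):

```r
coolest_function <- function() c(a = 15, b = 65, c = 275)
res <- coolest_function()

# Reading 1: positional, take the first element only.
by_position <- res[[1]]

# Reading 2: match the target name against the names of the result.
by_name <- res[["a"]]

# Reading 3: same as plain assignment, keep the whole vector.
whole <- res
```

The first two happen to agree here only because the element named "a" is also the first one.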

Alternatively, is the second one an error, because the function isn't
returning a list? This doesn't really fix the problem either, though,
because a single list of length > 1 *is a valid thing to return from an R
function*. I think, like in Julia, you'd need to declare the set of things
being returned, and perhaps map them to the variables you want assigned:

crazy_notworking_fun <- function() {
  return(a = 5, b = 65, c = 275)
}

[a_val = a, b_val = b] <- crazy_notworking_fun()

Or even,

[a_val <- a, b_val <-b] <- crazy_notworking_fun()


In that case, however, it becomes somewhat unclear (to me at least) what

only_val <- crazy_notworking_fun()

would do. Throw an error because multivalued functions are fundamentally
different and we can't pretend they aren't? This would disallow all of the
things you think "most r users would use every day" (a claim I'm somewhat
skeptical of, to be honest). If that's not it, though, what? I don't think
it can/should return the full list of results, because that introduces the
ambiguity this is trying to avoid right back in. Perhaps just the first
thing returned? That is internally consistent, but somewhat strange
behavior...

Best,
~G




On Sat, Mar 11, 2023 at 2:15 PM Sebastian Martin Krantz <
sebastian.krantz using graduateinstitute.ch> wrote:

> Thanks Duncan and Ivan for the careful thoughts. I'm not sure I can follow
> all aspects you raised, but to give my limited take on a few:
>
> > your proposal violates a very basic property of the  language, i.e. that
> all statements are expressions and have a value.
> > What's the value of 1 + (A, C = init_matrices()).
>
> I'm not sure I see the point here. I evaluated  1 + (d = dim(mtcars);
> nr = d[1]; nc = d[2]; rm(d)), which simply gives a syntax error, as
> the above expression should. `%=%` assigns to
> environments, so 1 + (c("A", "C") %=% init_matrices()) returns
> numeric(0), with A and C having their values assigned.
>
> > suppose f() returns list(A = 1, B = 2) and I do
> >  B, A <- f()
> > Should assignment be by position or by name?
>
> In other languages this is by position. The feature is not meant to
> replace list2env(), and being able to rename objects in the assignment
> is a vital feature of code
> using multi-input and multi-output functions, e.g. in Matlab or Julia.
>
> > Honestly, given that this is simply syntactic sugar, I don't think I
> would support it.
>
> You can call it that, but it would be used by almost every R user
> almost every day. Simple things like nr, nc = dim(x); values, vectors
> = eigen(x) etc. where the creation of intermediate objects
> is cumbersome and redundant.
>
> > I see you've already mentioned it ("JavaScript-like"). I think it would
> fulfil Sebastian's requirements too, as long as it is considered "true
> assignment" by the rest of the language.
>
> I don't have strong opinions about how the issue is phrased or
> implemented. Something like [t, n] = dim(x) might even be more clear.
> It's important though that assignment remains by position,
> so even if some output gets thrown away that should also be positional.
>
> >  A <- 0
> >  [A, B = A + 10] <- list(1, A = 2)
>
> I also fail to see the use of allowing this. Something like this is an
> error.
>
> > A = 2
> > (B = A + 1) <- 1
> Error in (B = A + 1) <- 1 : could not find function "(<-"
>
> Regarding the practical implementation, I think `collapse::%=%` is a
> good starting point. It could be introduced in R as a separate
> function, or `=` could be modified to accommodate its capability. It
> should be clear that
> with more than one LHS variables the assignment is an environment
> level operation and the results can only be used in computations once
> assigned to the environment, e.g. as in 1 + (c("A", "C") %=%
> init_matrices()),
> A and C are not available for the addition in this statement. The
> interpreter then needs to be modified to read something like nr, nc =
> dim(x) or [nr, nc] = dim(x) as an environment-level multiple
> assignment operation with no
> immediate value. Appears very feasible to my limited understanding,
> but I guess there are other things to consider still. Definitely
> appreciate the responses so far though.
>
> Best regards,
>
> Sebastian
>
>
>
>
>
> On Sat, 11 Mar 2023 at 20:38, Duncan Murdoch <murdoch.duncan using gmail.com>
> wrote:
>
> > On 11/03/2023 11:57 a.m., Ivan Krylov wrote:
> > > On Sat, 11 Mar 2023 11:11:06 -0500
> > > Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
> > >
> > >> That's clear, but your proposal violates a very basic property of the
> > >> language, i.e. that all statements are expressions and have a value.
> > >
> > > How about reframing this feature request from multiple assignment
> > > (which does go contrary to "everything has only one value, even if it's
> > > sometimes invisible(NULL)") to "structured binding" / "destructuring
> > > assignment" [*], which takes this single value returned by the
> > > expression and subsets it subject to certain rules? It may be easier to
> > > make a decision on the semantics for destructuring assignment (e.g.
> > > languages which have this feature typically allow throwing unneeded
> > > parts of the return value away), and it doesn't seem to break as much
> > > of the rest of the language if implemented.
> > >
> > > I see you've already mentioned it ("JavaScript-like"). I think it would
> > > fulfil Sebastian's requirements too, as long as it is considered "true
> > > assignment" by the rest of the language.
> > >
> > > The hard part is to propose the actual grammar of the new feature (in
> > > terms of src/main/gram.y, preferably without introducing conflicts) and
> > > its semantics (including the corner cases, some of which you have
> > > already mentioned). I'm not sure I'm up to the task.
> > >
> >
> > If I were doing it, here's what I'd propose:
> >
> >    '[' formlist ']' LEFT_ASSIGN expr
> >    '[' formlist ']' EQ_ASSIGN expr
> >    expr RIGHT_ASSIGN  '[' formlist ']'
> >
> > where `formlist` has the syntax of the formals list for a function
> > definition.  This would have the following semantics:
> >
> >     {
> >       *tmp* <- expr
> >
> >       # For arguments with no "default" expression,
> >
> >       argname1 <- *tmp*[[1]]
> >       argname2 <- *tmp*[[2]]
> >       ...
> >
> >       # For arguments with a default listed
> >
> >       argname3 <- with(*tmp*, default3)
> >     }
> >
> >
> > The value of the whole thing would therefore be (invisibly) the value of
> > the last item in the assignment.
> >
> > Two examples:
> >
> >    [A, B, C] <- expr   # assign the first three elements of expr
> >                        # to A, B, and C
> >
> >    [A, B, C = a + b] <- expr  # assign the first two elements of expr
> >                               # to A and B,
> >                               # assign with(expr, a + b) to C.
> >
> > Unfortunately, I don't think this could be done entirely by transforming
> > the expression (which is the way |> was done), and that makes it a lot
> > harder to write and to reason about.  E.g. what does this do?
> >
> >    A <- 0
> >    [A, B = A + 10] <- list(1, A = 2)
> >
> > According to the recipe above, I think it sets A to 1 and B to 12, but
> > maybe a user would expect B to be 10 or 11.  And according to that
> > recipe this is an error:
> >
> >    [A, B = A + 10] <- c(1, A = 2)
> >
> > which probably isn't what a user would expect, given that this is fine:
> >
> >    [A, B] <- c(1, 2)
> >
> > Duncan Murdoch
> >
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
