[Rd] Multiple Assignment built into the R Interpreter?
Duncan Murdoch
murdoch@dunc@n @end|ng |rom gm@||@com
Sat Mar 11 17:11:06 CET 2023
On 11/03/2023 9:54 a.m., Sebastian Martin Krantz wrote:
> Thanks Duncan,
>
> I know about list2env(), in fact a previous version of collapse::`%=%`
> was coded as
>
> "%=%" <- function(lhs, rhs) {
> if(!is.character(lhs)) stop("lhs needs to be character")
> if(!is.list(rhs)) rhs <- as.vector(rhs, "list")
> if(length(lhs) != length(rhs)) stop("length(lhs) not equal to
> length(rhs)")
> list2env(`names<-`(rhs, lhs), envir = parent.frame())
> invisible()
> }
>
> but as you say, the input needs to be converted to a list, and it calls
> several R functions, which led me to end up writing `%=%` in C:
> https://github.com/SebKrantz/collapse/blob/master/src/small_helper.c#L162 <https://github.com/SebKrantz/collapse/blob/master/src/small_helper.c#L162>.
> This implementation works in the way you describe, i.e. it has separate
> methods for all the standard vector types, and coerces to list otherwise.
>
> That being said, all implementations in packages falls short of being
> very useful, because R CMD Check it will still require global bindings
> for variables,
> unless this becomes a standard feature of the language. So I cannot use
> this in packages, and there is still a performance cost to it, in my
> case a call to
> .Call() and parent.frame(), which is quite low, but still high compared
> to the cost of `<-` or `=`.
Another R way to do what you're doing would be to stay within a list the
whole time, i.e. code it as
mats <- init_matrices(X, Y, Z)
with(mats, ... do things with A, C, Q, and R ... )
This won't give warnings about globals, and it makes very clear that
those 4 matrices are all closely related, and it allows you to work with
multiple 4-tuples of matrices, etc.
> So what I am requesting is indeed nothing less than to consider making
> this a permanent feature of the language itself.
That's clear, but your proposal violates a very basic property of the
language, i.e. that all statements are expressions and have a value.
What's the value of
1 + (A, C = init_matrices())
? I think you would disallow the above (though you didn't address it
when I raised it the first time), which means there would now be two
kinds of statements: ones that are expressions and therefore can be
used as function arguments, and ones that aren't.
> Given that the other 3 major scientific computing languages (Matlab,
> Python and Julia) have implemented it very successfully,
> I don't think the general practicality of it should be an issue.
> Regarding implementation in other languages, Julia works as follows:
>
> function init_matrices()
> A = 1; C = 2; Q = 3; R = 4
> return A, C, Q, R
> end
>
> res = init_matrices() # gives a Julia Tuple (A, C, Q, R)
> A, C = init_matrices() # Works, A is 1, C is 2, the others are
> dropped
That's pretty ugly having a singular LHS handled so much differently
from a plural LHS.
> A, C, Q, R = init_matrices() # Standard
>
> I think as far as R is concerned multiple return values are not really
> necessary given that one can always,
> return(list(A, C, Q, R)), although of course there is also a cost to
> list(). I also wouldn't mind being strict about it and
> not allowing A, C = init_matrices(), but others might disagree.
Another ambiguity: suppose f() returns list(A = 1, B = 2) and I do
B, A <- f()
Should assignment be by position or by name?
Honestly, given that this is simply syntactic sugar, I don't think I
would support it.
Duncan Murdoch
>
> Best regards,
>
> Sebastian
>
>
> On Sat, 11 Mar 2023 at 15:37, Duncan Murdoch <murdoch.duncan using gmail.com
> <mailto:murdoch.duncan using gmail.com>> wrote:
>
> I think the standard way to do this in R is given by list2env(), as
> described in a couple of answers on the SO page you linked.
>
> The syntax you proposed would be likely to be confusing in complex
> expressions, e.g.
>
> f(A, C, Q, R = init_matrices(X, Y, Z))
>
> would obviously not work but wouldn't trigger a syntax error, and
>
> f((A, C, Q, R = init_matrices(X, Y, Z)))
>
> could work, but looks too much like the previous one. So I think R
> would want Javascript-like
>
> [A, C, Q, R] <- init_matrices(X, Y, Z)
>
> instead. But then the question would come up about how to handle the
> RHS. Does the function have to return a list? What if the length of
> the list is not 4? Or is it just guaranteed to be equivalent to
>
> temp <- init_matrices(X, Y, Z)
> A <- temp[[1]]
> C <- temp[[2]]
> Q <- temp[[3]]
> R <- temp[[4]]
>
> which would work for other vector types besides lists?
>
> BTW, here's a little hack that almost works:
>
> `vals<-` <- function(x, ..., value) {
> others <- substitute(list(...))
> if (length(others) > 1)
> for (i in seq_along(others)[-1])
> assign(as.character(others[[i]]), value[[i]], envir =
> parent.frame())
> value[[1]]
> }
>
> You call it as
>
> vals(a, b, c) <- 1:3
>
> and it assigns 1 to a, 2 to b, and 3 to c. It doesn't quite do what
> you
> want because it requires that a exists already, but b and c don't
> have to.
>
> Duncan Murdoch
>
> On 11/03/2023 4:04 a.m., Sebastian Martin Krantz wrote:
> > Dear R Core,
> >
> > working on my dynamic factor modelling package, which requires
> several
> > subroutines to create and update several system matrices, I come
> back to
> > the issue of being annoyed by R not supporting multiple
> assignment out of
> > the box like Matlab, Python and julia. e.g. something like
> >
> > A, C, Q, R = init_matrices(X, Y, Z)
> >
> > would be a great addition to the language. I know there are several
> > workarounds such as the %<-% operator in the zeallot package or
> my own %=%
> > operator in collapse, but these don't work well for package
> development as
> > R CMD Check warns about missing global bindings for the created
> variables,
> > e.g. I would have to use
> >
> > A <- C <- Q <- R <- NULL
> > .c(A, C, Q, R) %=% init_matrices(X, Y, Z)
> >
> > in a package, which is simply annoying. Of course the standard way of
> >
> > init <- init_matrices(X, Y, Z)
> > A <- init$A; C <- init$C; Q <- init$Q; R <- init$R
> > rm(init)
> >
> > is also super cumbersome compared to Python or Julia. Another
> reason is of
> > course performance, even my %=% operator written in C has a
> non-negligible
> > performance cost for very tight loops, compared to a solution at the
> > interpretor level or in a primitive function such as `=`.
> >
> > So my conclusion at this point is that it is just significantly
> easier to
> > implement such codes in Julia, in addition to the greater
> performance it
> > offers. There are obvious reasons why I am still coding in R and
> C, thanks
> > to the robust API and great ecosystem of packages, but adding
> this could be
> > a presumably low-hanging fruit to make my life a bit easier.
> Several issues
> > for this have been filed on Stackoverflow, the most popular one (
> >
> https://stackoverflow.com/questions/7519790/assign-multiple-new-variables-on-lhs-in-a-single-line <https://stackoverflow.com/questions/7519790/assign-multiple-new-variables-on-lhs-in-a-single-line>)
> > has been viewed 77 thousand times.
> >
> > But maybe this has already been discussed here and already
> decided against.
> > In that case, a way to browse R-devel archives to find out would
> be nice.
> >
> > Best regards,
> >
> > Sebastian
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-devel using r-project.org <mailto:R-devel using r-project.org> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> <https://stat.ethz.ch/mailman/listinfo/r-devel>
>
More information about the R-devel
mailing list