[Rd] dimnames incoherence?

Martin Maechler m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Sat Feb 22 20:20:49 CET 2020


>>>>> William Dunlap 
>>>>>     on Fri, 21 Feb 2020 14:05:49 -0800 writes:

    > If we change the behavior  NULL--[[--assignment from

    > `[[<-`(NULL, 1, "a" ) # gives  "a"  (*not* a list)

    > to

    > `[[<-`(NULL, 1, "a" ) # gives  list("a")

    > then we have more consistency there *and* your bug is fixed too.
    > Of course, in other situations back-compatibility would be
    > broken as well.

    > Would that change the result of
    > L <- list(One=1) ; L$Two[[1]] <- 2
    > from the current list(One=1,Two=2) to list(One=1, Two=list(2))

    > and the result of
    > F <- 1L ; levels(F)[[1]] <- "one"
    > from structure(1L, levels="one") to structure(1L, levels=list("one"))?

Yes (twice).

This is indeed what happens in current R-devel, as I had
committed the proposition above yesterday.
So R-devel (with svn rev >= 77844 )  does this :

> L <- list(One=1) ; L$Two[[1]] <- 2 ; dput(L)
list(One = 1, Two = list(2))
>  F <- 1L ; levels(F)[[1]] <- "one" ; dput(F)
structure(1L, .Label = list("one"))
> 

but I find that still considerably more logical than current
(pre R-devel) R's

> L <- list(One=1) ; L$Two[[1]] <- 2 ; dput(L)
list(One = 1, Two = 2)
> L <- list(One=1) ; L$Two[[1]] <- 2:3 ; dput(L)
list(One = 1, Two = list(2:3))
> 
> F <- 1L ; levels(F)[[1]] <- "one" ; dput(F)
structure(1L, .Label = "one")
> F <- 1L ; levels(F)[[1]] <- c("one", "TWO") ; dput(F)
structure(1L, .Label = list(c("one", "TWO")))
>


    > This change would make L$Name[[1]] <- value act like L$Name$one <- value
    > in cases when L did not have a component named "Name" and value
    > had length 1.

(I don't entirely get what you mean, but)
indeed,
the  [[<-  assignments will be closer to corresponding $<-  assignments...
which I thought would be another good thing about the change.

    > I have seen users use [[<- where [<- is more appropriate in cases like
    > this.  Should there be a way to generate warnings about the change in
    > behavior as you've done with other syntax changes?

Well, good question.
I'd guess one would get such warnings "all over the place",  and
if a warning is given only once per session it may not be
effective  ... also the warning be confusing to the 99.9% of R users who
don't even get what we are talking about here ;-)

Thank you for your comments.. I did not get too many.

Martin

    > Bill Dunlap
    > TIBCO Software
    > wdunlap tibco.com


    > On Wed, Feb 19, 2020 at 12:59 PM Martin Maechler <maechler using stat.math.ethz.ch>
    > wrote:

    >> >>>>> Martin Maechler
    >> >>>>>     on Wed, 19 Feb 2020 18:06:57 +0100 writes:
    >> 
    >> >>>>> Serguei Sokol
    >> >>>>>     on Wed, 19 Feb 2020 15:21:21 +0100 writes:
    >> 
    >> >> Hi,
    >> >> I was bitten by a little incoherence in dimnames assignment or may
    >> be I
    >> >> missed some point.
    >> >> Here is the case. If I assign row names via dimnames(a)[[1]], when
    >> >> nrow(a)=1 then an error is thrown. But if I do the same when
    >> nrow(a) > 1
    >> >> it's OK. Is one of this case works unexpectedly? Both? Neither?
    >> 
    >> >> a=as.matrix(1)
    >> >> dimnames(a)[[1]]="a" # error: 'dimnames' must be a list
    >> 
    >> >> aa=as.matrix(1:2)
    >> >> dimnames(aa)[[1]]=c("a", "b") # OK
    >> 
    >> >> In the second case, dimnames(aa) is not a list (like in the first
    >> case)
    >> >> but it works.
    >> >> I would expect that the both work or neither.
    >> 
    >> > I agree (even though I'm strongly advising people to use '<-'
    >> > instead of '=');
    >> > which in this case helps you get the name of the function really
    >> > involved:  It is  `dimnames<-`  (which is implemented in C
    >> > entirely, for matrices and other arrays).
    >> 
    >> As a matter of fact, I wrote too quickly, the culprit here is
    >> the  `[[<-`  function (rather than `dimnames<-`),
    >> which has a special "inconsistency" feature when used to "add to NULL";
    >> almost surely inherited from S,  but I now think we should
    >> consider dropping on the occasion of aiming for  R 4.0.0 :
    >> 
    >> It's documented in ?Extract  that  length 1  `[[.]]`-assignment works
    >> specially for NULL (and dimnames(.) are NULL here).
    >> 
    >> Note you need to read and understand one of the tougher sections
    >> in the official  'R Language Definition'  Manual,
    >> section -- 3.4.4 Subset assignment ---
    >> i.e.,
    >> 
    >> https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Subset-assignment
    >> 
    >> notably this part:
    >> 
    >> Nesting of complex assignments is evaluated recursively
    >> 
    >> names(x)[3] <- "Three"
    >> 
    >> is equivalent to
    >> 
    >> `*tmp*` <- x
    >> x <- "names<-"(`*tmp*`, value="[<-"(names(`*tmp*`), 3, value="Three"))
    >> rm(`*tmp*`)
    >> 
    >> and then, apply this to our    dimnames(a)[[1]] <- "a"
    >> and so  replace
    >> 
    >> -  'names<-' by 'dimnames<-'
    >> -  '[<-'     by '[[<-'
    >> 
    >> ----------
    >> 
    >> Here is the rest of my analysis as valid R code
    >> {this is not new, Peter Dalgaard had explained this 10 or 20
    >> years ago to a mailing list audience IIRC} :
    >> 
    >> ## MM: The problematic behavior (bug ?) is in `[[<-`, not in `dimnames<-` :
    >> 
    >> `[[<-`(NULL, 1,   "a"     ) # gives  "a"  (*not* a list)
    >> `[[<-`(NULL, 1, c("a","b")) # gives list(c("a","b"))  !!
    >> 
    >> ##==> in C code: in  subassign.c  [ ~/R/D/r-devel/R/src/main/subassign.c ]
    >> ##==> function (~ 340 lines)
    >> ##            do_subassign2_dflt(SEXP call, SEXP op, SEXP args, SEXP rho)
    >> ## has
    >> "
    >> line svn r.  svn auth. c.o.d.e...
    >> ---- ------  --------- ----------------------------------------------
    >> 1741   4166      ihaka     if (isNull(x)) {
    >> 1742  45446     ripley         if (isNull(y)) {
    >> 1743  76166       luke             UNPROTECT(2); /* args, y */
    >> 1744   4166      ihaka             return x;
    >> 1745  45446     ripley         }
    >> 1746  35680    murdoch         if (length(y) == 1)
    >> 1747  68094       luke             x = allocVector(TYPEOF(y), 0);
    >> 1748  24954     ripley         else
    >> 1749  68094       luke             x = allocVector(VECSXP, 0);
    >> 1750   1820      ihaka     }
    >> ---- ------  --------- ----------------------------------------------
    >> "
    >> ## so clearly, in case the value is of length 1, no list is created .
    >> 
    >> ## For dimnames<-  Replacing NULL by list()  should be done in both cases
    >> , and then things work :
    >> `[[<-`(list(), 1,   "a"     ) # gives list( "a" )
    >> `[[<-`(list(), 1, c("a","b")) # gives list(c("a","b"))  !!
    >> 
    >> ## but the problem here is that  `[[<-` at this time in the game
    >> ## does *not* know that it comes from dimnames<- ....
    >> 
    >> ---------------
    >> 
    >> If we change the behavior  NULL--[[--assignment from
    >> 
    >> `[[<-`(NULL, 1, "a" ) # gives  "a"  (*not* a list)
    >> 
    >> to
    >> 
    >> `[[<-`(NULL, 1, "a" ) # gives  list("a")
    >> 
    >> 
    >> then we have more consistency there *and* your bug is fixed too.
    >> Of course, in other situations back-compatibility would be
    >> broken as well.
    >> 
    >> At the moment, I think we (R Core Team) should consider doing
    >> that here.
    >> 
    >> Martin
    >> 
    >> 
    >> 
    >> >> Your thoughts are welcome.
    >> 
    >> > I think we'd be happy if you report this formally on R's
    >> > bugzilla - https://bugs.r-project.org/ - as a bug.
    >> 
    --> https://www.r-project.org/bugs.html
    >> 
    >> >> From reading bugs.html you note that you should ask for
    >> >> an account there;
    >> > as I'm one of the people who get such request by e-mail,
    >> > in this case, I can do it directly (if you confirm you'd
    >> > want in a private e-mail).
    >> 
    >> >> Best, Serguei.
    >> 
    >> >> PS the same apply for dimnames(a)[[2]]<-.
    >> 
    >> > (of course)
    >> 
    >> > NB *and* importantly, the buglet is still in current
    >> > versions of R
    >> 
    >> > Best, Martin
    >> 
    >> >>> sessionInfo()
    >> >> R version 3.6.1 (2019-07-05) Platform:
    >> >> x86_64-pc-linux-gnu (64-bit) Running under: Mageia 7
    >> 
    >> >> Matrix products: default BLAS/LAPACK:
    >> >> /home/opt/OpenBLAS/lib/libopenblas_sandybridge-r0.3.6.so
    >> 
    >> >> locale:  [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C  [3]
    >> >> LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8  [5]
    >> >> LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8  [7]
    >> >> LC_PAPER=fr_FR.UTF-8       LC_NAME=C  [9]
    >> >> LC_ADDRESS=C               LC_TELEPHONE=C [11]
    >> >> LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
    >> 
    >> >> attached base packages: [1] parallel  stats     graphics
    >> >> grDevices utils     datasets methods [8] base
    >> 
    >> >> other attached packages: [1] multbxxc_1.0.1
    >> >> rmumps_5.2.1-11 [3] arrApply_2.1
    >> >> RcppArmadillo_0.9.800.4.0 [5] Rcpp_1.0.3
    >> >> slam_0.1-47 [7] nnls_1.4
    >> 
    >> >> loaded via a namespace (and not attached): [1]
    >> >> compiler_3.6.1   tools_3.6.1      codetools_0.2-16
    >> 
    >> >> ______________________________________________
    >> >> R-devel using r-project.org mailing list
    >> >> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> 
    >> > ______________________________________________
    >> > R-devel using r-project.org mailing list
    >> > https://stat.ethz.ch/mailman/listinfo/r-devel
    >> 
    >> ______________________________________________
    >> R-devel using r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel
    >>



More information about the R-devel mailing list