[Rd] Change 77844 breaking pkgs [Re: dimnames incoherence?]
William Dunlap
wdunlap at tibco.com
Sat Feb 22 23:18:20 CET 2020
> but then, it seems people want to perpetuate the
> claim of R to be slow
More charitably, I think the reasoning may have been that since x[[i]]
gives you one element of x, they should use x[[i]] <- value, for scalar i,
to stick in one element.
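
A minimal sketch of that distinction, using only base R and made-up
variable names (x, col). It deliberately avoids `[[<-` on NULL, because
that is the case whose result depends on the R version under discussion:

```r
# Extracting a single element with [[ ]] from an atomic vector.
x <- c("a", "b", "c")
one <- x[[2]]

# Replacing one element of an *existing* atomic vector with [[<- keeps
# the vector atomic.
x[[2]] <- "B"

# Growing from NULL with single-bracket [<- also yields an atomic vector.
col <- NULL
col[1] <- "first"

str(x)
str(col)
```

The single-bracket form is the one that does not depend on how
`[[<-` treats a NULL left-hand side.
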
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Sat, Feb 22, 2020 at 12:44 PM Martin Maechler <maechler at stat.math.ethz.ch>
wrote:
> >>>>> Martin Maechler
> >>>>> on Sat, 22 Feb 2020 20:20:49 +0100 writes:
>
> >>>>> William Dunlap
> >>>>> on Fri, 21 Feb 2020 14:05:49 -0800 writes:
>
> >> If we change the behavior NULL--[[--assignment from
>
> >> `[[<-`(NULL, 1, "a" ) # gives "a" (*not* a list)
>
> >> to
>
> >> `[[<-`(NULL, 1, "a" ) # gives list("a")
>
> >> then we have more consistency there *and* your bug is fixed too.
> >> Of course, in other situations back-compatibility would be
> >> broken as well.
>
> >> Would that change the result of
> >> L <- list(One=1) ; L$Two[[1]] <- 2
> >> from the current list(One=1,Two=2) to list(One=1, Two=list(2))
>
> >> and the result of
> >> F <- 1L ; levels(F)[[1]] <- "one"
> >> from structure(1L, levels="one") to structure(1L, levels=list("one"))?
>
> > Yes (twice).
>
> > This is indeed what happens in current R-devel, as I had
> > committed the proposition above yesterday.
> > So R-devel (with svn rev >= 77844 ) does this :
>
> >> L <- list(One=1) ; L$Two[[1]] <- 2 ; dput(L)
> > list(One = 1, Two = list(2))
> >> F <- 1L ; levels(F)[[1]] <- "one" ; dput(F)
> > structure(1L, .Label = list("one"))
> >>
>
> > but I find that still considerably more logical than current
> > (pre R-devel) R's
>
> >> L <- list(One=1) ; L$Two[[1]] <- 2 ; dput(L)
> > list(One = 1, Two = 2)
> >> L <- list(One=1) ; L$Two[[1]] <- 2:3 ; dput(L)
> > list(One = 1, Two = list(2:3))
> >>
> >> F <- 1L ; levels(F)[[1]] <- "one" ; dput(F)
> > structure(1L, .Label = "one")
> >> F <- 1L ; levels(F)[[1]] <- c("one", "TWO") ; dput(F)
> > structure(1L, .Label = list(c("one", "TWO")))
> >>
>
>
> >> This change would make L$Name[[1]] <- value act like
> >> L$Name$one <- value in cases when L did not have a component
> >> named "Name" and value had length 1.
>
> > (I don't entirely get what you mean, but)
> > indeed,
> > the [[<- assignments will be closer to corresponding $<- assignments...
> > which I thought would be another good thing about the change.
>
> >> I have seen users use [[<- where [<- is more appropriate in
> >> cases like this. Should there be a way to generate warnings
> >> about the change in behavior as you've done with other syntax
> >> changes?
>
> > Well, good question.
> > I'd guess one would get such warnings "all over the place", and
> > if a warning is given only once per session it may not be
> > effective ... also the warning would be confusing to the 99.9%
> > of R users who don't even get what we are talking about here ;-)
>
> > Thank you for your comments.. I did not get too many.
>
> Well, there's one situation where semi-experienced package
> authors are bitten by the new R-devel behavior...
>
> I'm seeing a few dozen CRAN packages breaking in R-devel >= r77844.
>
> One case is exactly as you (Bill) mention above: people using
> dd[[.]] <- .. where they should use single [.].
>
> In one package, I see an inefficient for loop over all rows of a
> data frame 'dd'
>
> for(i in 1:nrow(dd)) {
>
> ...
>
> dd$<nonexisting_column>[[i]] <- <one character string>
>
> }
>
> This used to work -- as said quite inefficiently:
> for i=1 it created the **full** data frame column and then,
> once the column exists, it presumably does assign one entry
> after the other...
>
> This code now breaks (later!) in the package, because the
> new column ends up as a *list* of strings, instead of a vector
> of strings.
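
A hedged sketch of how downstream code could detect such an accidental
list column and flatten it. The data frame dd and the column name label
are made up for illustration, and the list column is built explicitly
here so the example behaves the same in any R version:

```r
# Simulate the outcome described above: a column that is a list of
# length-one strings rather than a character vector.
dd <- data.frame(id = 1:3)
dd$label <- list("a", "b", "c")

was_list <- is.list(dd$label)

# Flatten only when every cell holds exactly one value, so nothing is
# silently dropped or duplicated.
if (is.list(dd$label) && all(lengths(dd$label) == 1L)) {
  dd$label <- unlist(dd$label, use.names = FALSE)
}

str(dd)
```

This only repairs the symptom; the assignment that created the column
is still the thing to change.
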
>
> I think there are quite a few such cases also in other CRAN
> packages which now break with the latest R-devel.
>
> Coming back to Bill Dunlap's question: Should we not warn here?
> And now when our toplevel list is a data frame, maybe we should
> warn indeed, if we can easily limit ourselves to such "bizarre"
> ways of growing a data frame ...
>
>
> dd $ foo [[i]] <- vv
>
> <==>
>
> `*tmp*` <- dd
> dd <- `$<-`(`*tmp*`, foo, value = `[[<-`(`*tmp*`$foo, i, vv))
> rm(`*tmp*`)
>
> but then really we have the same problem as previously: The
> `[[<-`(NULL, i, vv) part does not "know" anything about the
> fact that we are in a data frame column creation context.
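
The expansion can be replayed by hand. This is a sketch only: the names
dd_direct, dd_manual, tmp and inner are invented, and an ordinary
variable tmp stands in for the internal `*tmp*`. It makes no claim about
whether the new column is atomic or a list, since that depends on the R
version; it only compares the two routes:

```r
# Route 1: the nested replacement as a user would write it.
dd_direct <- data.frame(x = 1:3)
dd_direct$foo[[1]] <- "a"

# Route 2: the same steps spelled out.
dd_manual <- data.frame(x = 1:3)
tmp <- dd_manual
foo_missing <- is.null(tmp$foo)       # the column does not exist yet
inner <- `[[<-`(tmp$foo, 1, "a")      # sees only NULL, not the data frame
dd_manual <- `$<-`(tmp, "foo", value = inner)
rm(tmp)

same <- identical(dd_direct, dd_manual)
print(same)
str(inner)
```

The inner call receives nothing but NULL, the index and the value, which
is why it cannot tell that a data frame column is being created.
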
>
> If the R package author had used '[i]' instead of '[[i]]'
> he|she would have been safe
>
> (as they would be if they worked more efficiently and created
> the whole variable as a vector and only then added it to the
> data frame ... but then, it seems people want to perpetuate the
> claim of R to be slow ... even if it's them who make R run
> slowly ... ;-))
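
A sketch of the two safer patterns mentioned above, assuming a made-up
helper make_label() in place of whatever the package computes per row:

```r
dd <- data.frame(id = 1:3)

# Hypothetical per-row computation returning one string.
make_label <- function(i) paste0("row-", i)

# Pattern 1: keep the loop, but use single-bracket [i].
dd_loop <- dd
for (i in seq_len(nrow(dd_loop))) {
  dd_loop$label[i] <- make_label(i)
}

# Pattern 2: build the whole vector first, then add it once.
dd_vec <- dd
dd_vec$label <- vapply(seq_len(nrow(dd_vec)), make_label, character(1))

str(dd_loop)
str(dd_vec)
```

Pattern 2 avoids modifying the data frame once per row, and vapply()
states the expected type of each result up front.
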
>
>