[Rd] Hashed environments of size <5 never grow
Martin Maechler
m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Thu Apr 14 08:59:30 CEST 2022
>>>>> Duncan Garmonsway
>>>>> on Mon, 11 Apr 2022 22:24:52 +0100 writes:
> Hello,
> Hashed environments that begin with a (non-default) size
> of 4 or less will never grow, which is very detrimental
> to performance. For example,
> ```
> n <- 10000
> l <- vector("list", n)
> l <- setNames(l, seq_len(n))
> # Takes a second, and nchains remains 1.
> e1 <- list2env(l, hash = TRUE, size = 1)
> env.profile(e1)$nchains
> # [1] 1
> # Returns instantly, and nchains grows to 6950
> e2 <- list2env(l, hash = TRUE, size = 5)
> env.profile(e2)$nchains
> # [1] 6950
> ```
> The cause is that, when calling the growth function, the new size is
> truncated to an integer. See src/main/envir.c line 440, or
> https://github.com/wch/r-source/blob/d9b9d00b6d2764839f229bf011dda8d027aae227/src/main/envir.c#L440
> Given the hard-coded growth rate of 1.2, any size of 4 or less will be
> truncated back to itself.
> (int) (1 * 1.2) = 1
> (int) (2 * 1.2) = 2
> (int) (3 * 1.2) = 3
> (int) (4 * 1.2) = 4
> (int) (5 * 1.2) = 6
Yes. I'm convinced this has been an oversight and should be
corrected.
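
For readers who want to check the arithmetic, here is a minimal sketch in C of the truncation described in the report. The function names are hypothetical and `GROWTH_RATE` is only assumed to correspond to the hard-coded 1.2 mentioned above; this is not the code in src/main/envir.c.

```c
#include <assert.h>

/* Assumption: stands in for the hard-coded growth rate of 1.2 from the report. */
#define GROWTH_RATE 1.2

/* Hypothetical helper: the new table size when the product is cast to int. */
int grown_size_truncated(int size)
{
    assert(size > 0);
    return (int)(size * GROWTH_RATE);   /* the cast drops the fractional part */
}

/* Hypothetical helper: number of resize steps needed to reach `target`,
 * or -1 if the size is a fixed point and can never grow. */
int resizes_until_truncated(int size, int target)
{
    int steps = 0;
    while (size < target) {
        int next = grown_size_truncated(size);
        if (next <= size)
            return -1;                  /* stuck: truncation maps size to itself */
        size = next;
        steps++;
    }
    return steps;
}
```

Under these assumptions, sizes 1 through 4 each map back to themselves, so `resizes_until_truncated()` reports -1 for them, while 5 grows to 6.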
> This is a rare case, and I couldn't find any examples in CRAN packages of
> the `size` argument being used at all, let alone so small. Even so, it
> tripped me up, and could be fixed by using `ceil()` in src/main/envir.c
> line 440 as follows.
> new_table = R_NewHashTable((int)(ceil(HASHSIZE(table) *
> HASHTABLEGROWTHRATE)));
Indeed, this bug would surface very rarely,
but I agree a fix such as the one you propose is appropriate.
I'll do so, also adding a regression test.
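
As a rough illustration of why rounding up removes the fixed point, the following C sketch repeats the growth step with a ceiling instead of a plain cast. All names are hypothetical. To keep the sketch free of a libm link dependency, a small local `ceil_positive()` stands in for `ceil()`; it is only meant for the small positive values used here.

```c
#include <assert.h>

/* Assumption: stands in for the hard-coded growth rate of 1.2 from the report. */
#define GROWTH_RATE 1.2

/* Hypothetical stand-in for ceil(), valid for small positive x only. */
int ceil_positive(double x)
{
    int t = (int)x;                     /* truncate toward zero */
    return (x > (double)t) ? t + 1 : t; /* round up if anything was dropped */
}

/* Hypothetical helper: the new table size when the product is rounded up. */
int grown_size_ceil(int size)
{
    assert(size > 0);
    return ceil_positive(size * GROWTH_RATE);
}

/* Hypothetical helper: number of resize steps needed to reach `target`.
 * With rounding up, every step increases the size by at least 1. */
int resizes_until_ceil(int size, int target)
{
    int steps = 0;
    while (size < target) {
        size = grown_size_ceil(size);
        steps++;
    }
    return steps;
}
```

With rounding up, a table of size 1 grows to 2, then 3, 4, 5 and 6, so small initial sizes are no longer stuck.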
Martin Maechler
ETH Zurich and R Core team
> Kind regards,
> Duncan Garmonsway