[Rd] dict package: dictionary data structure for R
Martin Maechler
maechler at stat.math.ethz.ch
Tue Jul 24 19:32:47 CEST 2007
>>>>> "HenrikB" == Henrik Bengtsson <hb at stat.berkeley.edu>
>>>>> on Tue, 24 Jul 2007 18:58:04 +0200 writes:
HenrikB> On 7/23/07, Seth Falcon <sfalcon at fhcrc.org> wrote:
>> Bill Dunlap <bill at insightful.com> writes:
>> > With environments, if you use a prime number for the size
>> > you get considerably better results. E.g.,
>>
>> > Perhaps new.env() should push the requested size up
>> > to the next prime by default.
>>
>> Perhaps. I think we should also investigate other hashing functions
>> since computing the next prime and doing so for resizes will take
>> longer than not having to do it and it will add complexity to the
>> code.
HenrikB> An alternative is to hard-wire primes within a reasonable range:
HenrikB> http://primes.utm.edu/lists/small/millions/
HenrikB> http://www.math.utah.edu/~pa/math/p10000.html
HenrikB> Maybe primes close to 2^n are good enough for this problem:
HenrikB> http://primes.utm.edu/lists/2small/
Yes, I had a similar thought....
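For the "primes close to 2^n" idea, such a table can also be generated rather than copied from a web page. The following is only a minimal sketch, assuming plain trial division is fast enough at these sizes; `is_prime()` and `prev_prime()` are hypothetical helper names, not part of base R:

```r
# Trial-division primality test; adequate for the small sizes used here.
is_prime <- function(n) {
  if (n < 2) return(FALSE)
  if (n < 4) return(TRUE)            # 2 and 3
  if (n %% 2 == 0) return(FALSE)
  r <- floor(sqrt(n))
  if (r < 3) return(TRUE)            # 5 and 7: no odd divisor to try
  all(n %% seq(3, r, by = 2) != 0)
}

# Largest prime <= n (hypothetical helper; requires n >= 2).
prev_prime <- function(n) {
  stopifnot(n >= 2)
  while (!is_prime(n)) n <- n - 1
  n
}

# Largest prime below each of 2^8, ..., 2^16.
pow2_primes <- vapply(2^(8:16), prev_prime, numeric(1))
print(pow2_primes)
```

A vector like `pow2_primes` could be hard-wired as a lookup table, so that nothing has to be computed when a hash table is created or resized.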
Note that you don't need web sites for prime numbers:
my R factorization utilities, which I have mentioned a few times,
e.g., here
http://tolstoy.newcastle.edu.au/R/help/05/01/10007.html
can give all primes up to a few hundred thousand quickly enough:
> source("ftp://stat.ethz.ch/U/maechler/R/prime-numbers-fn.R")
> system.time(PS3 <- prime.sieve(prime.sieve(prime.sieve())))
   user  system elapsed
  0.446   0.006   0.452
> head(PS3, 20)
[1] 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71
> tail(PS3, 20)
[1] 273233 273253 273269 273271 273281 273283 273289 273311 273313 273323
[11] 273349 273359 273367 273433 273457 273473 273503 273517 273521 273527
>
There are more prime / factorization utilities in that simple R
source file, but, as I say there, one should really use C code
to do this; then again, R has become so fast ...
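
To illustrate the suggestion from the thread of pushing a requested size up to the next prime, here is a minimal base-R sketch. It assumes a plain sieve of Eratosthenes rather than the prime.sieve() from the file above, and `primes_upto()` and `next_prime_size()` are hypothetical names. Whether a prime `size` actually helps new.env() is the claim under discussion, not something this code demonstrates.

```r
# Sieve of Eratosthenes: all primes <= n, returned as an integer vector.
primes_upto <- function(n) {
  is_p <- rep(TRUE, n)
  is_p[1] <- FALSE
  i <- 2
  while (i * i <= n) {
    if (is_p[i]) is_p[seq(i * i, n, by = i)] <- FALSE  # strike out multiples of i
    i <- i + 1
  }
  which(is_p)
}

# Smallest tabulated prime >= size; falls back to 'size' itself
# when the request is beyond the end of the table (an assumption).
next_prime_size <- function(size, table) {
  cand <- table[table >= size]
  if (length(cand) > 0) cand[1] else size
}

primes <- primes_upto(300000)

# Request an environment of roughly 1000 entries, rounded up to a prime.
sz <- next_prime_size(1000L, primes)
e <- new.env(hash = TRUE, size = sz)
assign("key", 42, envir = e)
print(sz)
```

The table is computed once here; in C code inside R one would presumably hard-wire a much shorter list instead.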
Martin Maechler, ETH Zurich
HenrikB> Just my $.02
HenrikB> /Henrik