[Rd] A bug in the R Mersenne Twister (RNG) code?
Dirk Eddelbuettel
edd at debian.org
Wed Aug 31 17:30:07 CEST 2016
On 30 August 2016 at 18:29, Duncan Murdoch wrote:
| I don't see evidence of a bug. There have been several versions of the
| MT; we may be using a different version than you are. Ours is the
| 1999/10/28 version; the web page you cite uses one from 2002.
|
| Perhaps the newer version fixes some problems, and then it would be
| worth considering a change. But changing the default RNG definitely
| introduces problems in reproducibility, so it's not obvious that we
| would do it.
Yep. FWIW the GNU GSL adopted the 2002 version a while ago too. Quoting from
https://www.gnu.org/software/gsl/manual/html_node/Random-number-generator-algorithms.html
Generator: gsl_rng_mt19937
The MT19937 generator of Makoto Matsumoto and Takuji Nishimura is a
variant of the twisted generalized feedback shift-register algorithm, and
is known as the “Mersenne Twister” generator. It has a Mersenne prime
period of 2^19937 - 1 (about 10^6000) and is equi-distributed in 623
dimensions. It has passed the DIEHARD statistical tests. It uses 624 words
of state per generator and is comparable in speed to the other
generators. The original generator used a default seed of 4357 and
choosing s equal to zero in gsl_rng_set reproduces this. Later versions
switched to 5489 as the default seed, you can choose this explicitly via
gsl_rng_set instead if you require it.
For more information see,
Makoto Matsumoto and Takuji Nishimura, “Mersenne Twister: A
623-dimensionally equidistributed uniform pseudorandom number
generator”. ACM Transactions on Modeling and Computer Simulation,
Vol. 8, No. 1 (Jan. 1998), Pages 3–30

The generator gsl_rng_mt19937
uses the second revision of the seeding procedure published by the two
authors above in 2002. The original seeding procedures could cause
spurious artifacts for some seed values. They are still available
through the alternative generators gsl_rng_mt19937_1999 and
gsl_rng_mt19937_1998.
Note the last sentence here.
This is all somewhat technical code, so when I noticed the above I could
never figure out what exactly R was doing in its implementation. But "innocent
until proven guilty" -- a sufficient number of people ought to have looked at
this by now -- so I saw no need to pursue it further.
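For illustration only, here is a minimal Python sketch of how the original
(1998) and the 2002 seeding procedures differ. It is not R's code and not
GSL's; the constants (69069 and 1812433253) and the recurrences are written
down as I recall them from the reference mt19937.c sources, and the function
names seed_1998 and seed_2002 are made up for this example. The 1999 variant
is not covered. It only fills the 624-word state; the generator itself is the
same in all versions.

```python
N = 624              # words of state in MT19937
MASK32 = 0xFFFFFFFF  # keep every word to 32 bits


def seed_1998(seed):
    """Original seeding (assumed): a plain multiplicative LCG, multiplier 69069."""
    state = [seed & MASK32]
    for _ in range(1, N):
        state.append((69069 * state[-1]) & MASK32)
    return state


def seed_2002(seed):
    """2002 seeding (assumed, as in init_genrand): mixes in high bits and the index."""
    state = [seed & MASK32]
    for i in range(1, N):
        prev = state[-1]
        state.append((1812433253 * (prev ^ (prev >> 30)) + i) & MASK32)
    return state


if __name__ == "__main__":
    even_seed = 4356  # arbitrary even seed, chosen only for the demonstration
    old = seed_1998(even_seed)
    new = seed_2002(even_seed)
    print("1998: all state words even?", all(w % 2 == 0 for w in old))
    print("2002: all state words even?", all(w % 2 == 0 for w in new))
```

With an even seed the first procedure can only ever produce even state words,
because an even number times 69069 stays even modulo 2^32. The second adds the
index i at every step, so the word at index 1 is already odd. That is one
easily visible weakness of the old seeding; whether it is the artifact the GSL
manual has in mind I cannot say.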
Dirk
--
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org