[Rd] A bug in the R Mersenne Twister (RNG) code?

Mark Roberts ersatz.too at gmail.com
Tue Aug 30 23:45:37 CEST 2016


Whomever,

I recently sent the "bug report" below to R-core at r-project.org and have 
just been asked to submit it to you instead.

Although I am not really an R user, I have installed R version 3.3.1.  
I am also the author of a statistics program written in Visual Basic 
that contains a component which correctly implements the Mersenne 
Twister (MT) algorithm.  I believe that it is not possible to generate 
the correct stream of pseudorandom numbers using R's default MT random 
number generator, and I am not the first person to notice this.  A 2013 
post (www.r-bloggers.com/reproducibility-and-randomness/) on an R 
website asserts that the SAS implementation of the MT algorithm 
produces different numbers than R does when given the same starting 
seed.  The author of that post didn’t get anyone to respond to his 
query about the reason for this SAS vs. R discrepancy.

There are two ways of initializing the original MT computer program 
(written in C) so that an identical stream of numbers can be repeatedly 
generated:  1) with a particular integer seed number, and 2) with a 
particular array of integers.   In the 'compilation and usage' section 
of this webpage (https://github.com/cslarsen/mersenne-twister) there is 
a listing of the first 200 random numbers the MT algorithm should 
produce for seed number = 1.  The inventors of the Mersenne Twister 
random number generator provided two different sets of the first 1000 
numbers produced by a correctly coded 32-bit implementation of the MT 
algorithm when initializing it with a particular array of integers at: 
www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/CODES/mt19937ar.out. 
[There is a link to this output at: 
www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html.]
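
To make the two initialization routes concrete, here is a minimal Python 
sketch of 32-bit MT19937.  It is a transcription under the assumption 
that it follows the publicly available mt19937ar.c reference code; the 
class name MT19937 is my own, the method names mirror the C functions, 
and the key (0x123, 0x234, 0x345, 0x456) is assumed to be the array used 
to produce mt19937ar.out.  It is meant for checking reference values, 
not as a description of what R does internally.

```python
N, M = 624, 397
MATRIX_A = 0x9908B0DF      # constant vector a
UPPER_MASK = 0x80000000    # most significant bit
LOWER_MASK = 0x7FFFFFFF    # least significant 31 bits
MASK32 = 0xFFFFFFFF


class MT19937:
    """Hypothetical helper class mirroring the mt19937ar.c functions."""

    def __init__(self):
        self.mt = [0] * N
        self.mti = N

    def init_genrand(self, seed):
        """Route 1: initialize the state from a single integer seed."""
        mt = self.mt
        mt[0] = seed & MASK32
        for i in range(1, N):
            prev = mt[i - 1]
            mt[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & MASK32
        self.mti = N  # force a state refresh on the first draw

    def init_by_array(self, key):
        """Route 2: initialize the state from an array of integers."""
        self.init_genrand(19650218)
        mt = self.mt
        i, j = 1, 0
        for _ in range(max(N, len(key))):
            prev = mt[i - 1]
            mt[i] = ((mt[i] ^ ((prev ^ (prev >> 30)) * 1664525))
                     + key[j] + j) & MASK32
            i += 1
            j += 1
            if i >= N:
                mt[0] = mt[N - 1]
                i = 1
            if j >= len(key):
                j = 0
        for _ in range(N - 1):
            prev = mt[i - 1]
            mt[i] = ((mt[i] ^ ((prev ^ (prev >> 30)) * 1566083941))
                     - i) & MASK32
            i += 1
            if i >= N:
                mt[0] = mt[N - 1]
                i = 1
        mt[0] = 0x80000000  # guarantees a non-zero initial state

    def genrand_int32(self):
        """Return the next 32-bit unsigned output."""
        mt = self.mt
        if self.mti >= N:
            # Regenerate all N state words in place.
            for kk in range(N):
                y = (mt[kk] & UPPER_MASK) | (mt[(kk + 1) % N] & LOWER_MASK)
                mt[kk] = mt[(kk + M) % N] ^ (y >> 1) ^ (MATRIX_A if y & 1 else 0)
            self.mti = 0
        y = mt[self.mti]
        self.mti += 1
        # Tempering
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5680
        y ^= (y << 15) & 0xEFC60000
        y ^= y >> 18
        return y & MASK32


gen = MT19937()
gen.init_genrand(1)                               # integer seed
print([gen.genrand_int32() for _ in range(5)])
gen.init_by_array([0x123, 0x234, 0x345, 0x456])   # array of integers
print([gen.genrand_int32() for _ in range(5)])
```

If the transcription is faithful, the first printed list should begin 
with 1791095845 and the second with 1067595299, which can be compared 
against the listings at the two sites cited above.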

My statistics program reproduces exactly those 200 numbers from the 
first site mentioned in the previous paragraph, and it also reproduces 
the reference values listed at the second site (though I didn't check 
all 2000 values).  Assuming that the MT code within R uses the 32-bit MT 
algorithm, I suspect that the current version of R can't do that.  If 
you (i.e., anyone who might knowledgeably respond to this report) are 
able to duplicate those reference test values, then please send me the R 
code that initializes the MT code within R to do so, and I apologize 
for having wasted your time.  If you (collectively) can't do that, then 
R is very likely using incorrectly implemented MT code.  And if that is 
the case, it seems to me that this is something that should be fixed.

Mark Roberts, Ph.D.
