[Rd] Problem in random number generation for Marsaglia-Multicarry + Kinderman-Ramage
peter dalgaard
pd@|gd @end|ng |rom gm@||@com
Thu Aug 12 15:07:52 CEST 2021
With these matters, one has to be careful to distinguish between method error and implementation error.
The reason for changing the RNG setup in R v. 1.7.0 was pretty much this kind of unfortunate interaction between M-M and K-R. There are even more egregious examples for the distribution of maxima of normal variables. Try e.g.
RNGversion("1.6.0") # Marsaglia-Multicarry, Kinderman-Ramage
s <- replicate(1e6,max(rnorm(10)))
plot(density(s))
(A further bug in K-R was fixed in 1.7.1, but that is tangential to this.)
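To judge how far such a density is off, one can overlay the exact density of the maximum of n independent standard normals, which is n * pnorm(x)^(n-1) * dnorm(x). The following is only a sketch: dmaxnorm is a helper name made up for illustration, the sample size is reduced to keep it quick, and it selects the generators via set.seed() as in the report quoted below rather than via RNGversion("1.6.0").

```r
# Exact density of the maximum of n i.i.d. N(0,1) variables:
# d/dx pnorm(x)^n = n * pnorm(x)^(n - 1) * dnorm(x)
# (dmaxnorm is a hypothetical helper, not part of R)
dmaxnorm <- function(x, n = 10) n * pnorm(x)^(n - 1) * dnorm(x)

old <- RNGkind()                        # remember the current generator settings
set.seed(1, kind = "Marsaglia-Multicarry", normal.kind = "Kinderman-Ramage")
s <- replicate(1e5, max(rnorm(10)))     # assumption: 1e5 replicates instead of 1e6
plot(density(s))                        # simulated density
curve(dmaxnorm(x), add = TRUE, lty = 2) # theoretical reference, dashed
do.call(RNGkind, as.list(old))          # restore the previous generators
```

Any systematic gap between the solid and the dashed curve points at the generator combination rather than at sampling noise, which should shrink as the number of replicates grows.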
A glimpse of the source of the problem is seen in the "microcorrelations" in this:
RNGkind("Mar")
m <- matrix(runif(4e7), 2)
plot(m[1,],m[2,],xlim=c(0,1e-3),pch=".")
m <- matrix(runif(4e7),2)
points(m[1,],m[2,],pch=".")
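For comparison, the same window can be drawn side by side for M-M and Mersenne-Twister. This is a rough sketch under a few assumptions of mine: the seed, the number of pairs (n_pairs) and the panel layout are arbitrary, and only the points inside the window are kept before plotting.

```r
old <- RNGkind()                     # remember the current generator settings
n_pairs <- 5e6                       # assumption: fewer pairs than above, to save memory
op <- par(mfrow = c(1, 2))
for (kind in c("Marsaglia-Multicarry", "Mersenne-Twister")) {
  set.seed(1, kind = kind)
  m <- matrix(runif(2 * n_pairs), 2) # each column is a pair of consecutive draws
  keep <- m[1, ] < 1e-3              # pairs whose first value falls in the window
  plot(m[1, keep], m[2, keep], xlim = c(0, 1e-3), pch = ".", main = kind)
}
par(op)
do.call(RNGkind, as.list(old))       # restore the previous generators
```

With independent uniforms the second coordinate should fill the strip evenly, so visible stripes or gaps in one panel but not the other would be attributable to the generator.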
These examples are from 2003, so the issue has been known for almost 2 decades. However, to the best of our knowledge, the M-M RNG is a faithful implementation of their method, so we have left the RNG in R's arsenal, in case someone needed it for some specific purpose.
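For anyone who wants to check which pairings are affected, the tail counts from the report quoted below can be tabulated per combination of uniform and normal generator. A minimal sketch: tail_counts is a hypothetical helper, not part of R, and the seed and cutoff are simply the ones used in the report.

```r
# Count draws beyond -cut and +cut for a given generator pairing
# (tail_counts is a hypothetical helper, not part of R)
tail_counts <- function(kind, normal.kind, n = 1e7, cut = 4, seed = 1) {
  old <- RNGkind()
  on.exit(do.call(RNGkind, as.list(old)))  # restore generators on exit
  set.seed(seed, kind = kind, normal.kind = normal.kind)
  v <- rnorm(n)
  c(lower = sum(v < -cut), upper = sum(v > cut), expected = n * pnorm(-cut))
}

combos <- expand.grid(kind = c("Marsaglia-Multicarry", "Mersenne-Twister"),
                      normal.kind = c("Kinderman-Ramage", "Inversion"),
                      stringsAsFactors = FALSE)
res <- mapply(tail_counts, combos$kind, combos$normal.kind, USE.NAMES = FALSE)
out <- cbind(combos, t(res))               # one row per pairing
print(out)
```

Each row can then be compared against the expected count, for example with poisson.test() as in the report.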
- pd
> On 12 Aug 2021, at 11:51 , GILLIBERT, Andre <Andre.Gillibert using chu-rouen.fr> wrote:
>
> Dear R developers,
>
>
> I believe I have discovered a severe flaw that occurs when the Marsaglia-Multicarry pseudo-random number generator is combined with the Kinderman-Ramage algorithm to generate normally distributed numbers.
>
>
> The sample program is very simple (tested on R-4.1.1 x86_64 on Windows 10):
>
> set.seed(1, "Marsaglia-Multicarry", normal.kind="Kinderman-Ramage")
> v=rnorm(1e7)
> poisson.test(sum(v < (-4)))$conf.int # returns c(34.5, 62.5)
> poisson.test(sum(v > (4)))$conf.int # returns c(334.2, 410.7)
> pnorm(-4)*1e7 # returns 316.7
>
>
> There should be approximately 316 values less than -4 and 316 values greater than +4, but there are far too few values less than -4.
>
> Results are similar with other random seeds, and things are even more obvious with larger sample sizes.
>
> The Kinderman-Ramage algorithm is fine when combined with Mersenne-Twister, and Marsaglia-Multicarry is fine when combined with the normal.kind="Inversion" algorithm, but the combination of Marsaglia-Multicarry and Kinderman-Ramage seems to have severe flaws.
>
> R should at least warn about that combination!
>
> What do you think? Should I file a bug report?
>
> --
> Sincerely
> André GILLIBERT
>
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk Priv: PDalgd using gmail.com