Random {base} | R Documentation |
Random Number Generation
Description
.Random.seed is an integer vector, containing the random number generator (RNG) state for random number generation in R. It can be saved and restored, but should not be altered by the user.
RNGkind is a more friendly interface to query or set the kind of RNG in use.
RNGversion can be used to set the random generators as they were in an earlier R version (for reproducibility).
set.seed is the recommended way to specify seeds.
Usage
.Random.seed <- c(rng.kind, n1, n2, ...)
RNGkind(kind = NULL, normal.kind = NULL, sample.kind = NULL)
RNGversion(vstr)
set.seed(seed, kind = NULL, normal.kind = NULL, sample.kind = NULL)
Arguments
kind |
character or |
normal.kind |
character string or |
sample.kind |
character string or |
seed |
a single value, interpreted as an integer, or |
vstr |
a character string containing a version number,
e.g., |
rng.kind |
integer code in |
n1 , n2 , ... |
integers. See the details for how many are required
(which depends on |
Details
The currently available RNG kinds are given below. kind is partially matched to this list. The default is "Mersenne-Twister".
"Wichmann-Hill"
:-
The seed, .Random.seed[-1] == r[1:3], is an integer vector of length 3, where each r[i] is in 1:(p[i] - 1), where p is the length 3 vector of primes, p = (30269, 30307, 30323). The Wichmann–Hill generator has a cycle length of 6.9536 \times 10^{12} (= prod(p-1)/4, see Applied Statistics (1984) 33, 123 which corrects the original article). It exhibits 12 clear failures in the TestU01 Crush suite and 22 in the BigCrush suite (L'Ecuyer, 2007).
"Marsaglia-Multicarry"
:-
A multiply-with-carry RNG is used, as recommended by George Marsaglia in his post to the mailing list ‘sci.stat.math’. It has a period of more than 2^{60}.
It exhibits 40 clear failures in L'Ecuyer's TestU01 Crush suite. Combined with Ahrens-Dieter or Kinderman-Ramage it exhibits deviations from normality even for univariate distribution generation. See PR#18168 for a discussion.
The seed is two integers (all values allowed).
"Super-Duper"
:-
Marsaglia's famous Super-Duper from the 70's. This is the original version which does not pass the MTUPLE test of the Diehard battery. It has a period of \approx 4.6\times 10^{18} for most initial seeds. The seed is two integers (all values allowed for the first seed: the second must be odd).
We use the implementation by Reeds et al. (1982–84).
The two seeds are the Tausworthe and congruence long integers, respectively.
It exhibits 25 clear failures in the TestU01 Crush suite (L'Ecuyer, 2007).
"Mersenne-Twister"
:-
From Matsumoto and Nishimura (1998); code updated in 2002. A twisted GFSR with period 2^{19937} - 1 and equidistribution in 623 consecutive dimensions (over the whole period). The ‘seed’ is a 624-dimensional set of 32-bit integers plus a current position in that set.
R uses its own initialization method due to B. D. Ripley and is not affected by the initialization issue in the 1998 code of Matsumoto and Nishimura addressed in a 2002 update.
It exhibits 2 clear failures in each of the TestU01 Crush and the BigCrush suite (L'Ecuyer, 2007).
"Knuth-TAOCP-2002"
:-
A 32-bit integer GFSR using lagged Fibonacci sequences with subtraction. That is, the recurrence used is
X_j = (X_{j-100} - X_{j-37}) \bmod 2^{30}
and the ‘seed’ is the set of the 100 last numbers (actually recorded as 101 numbers, the last being a cyclic shift of the buffer). The period is around 2^{129}.
"Knuth-TAOCP"
:-
An earlier version from Knuth (1997).
The 2002 version was not backwards compatible with the earlier version: the initialization of the GFSR from the seed was altered. R did not allow you to choose consecutive seeds, the reported ‘weakness’, and already scrambled the seeds. Otherwise, the algorithm is identical to Knuth-TAOCP-2002, with the same lagged Fibonacci recurrence formula.
Initialization of this generator is done in interpreted R code and so takes a short but noticeable time.
It exhibits 3 clear failures in the TestU01 Crush suite and 4 clear failures in the BigCrush suite (L'Ecuyer, 2007).
"L'Ecuyer-CMRG"
:-
A ‘combined multiple-recursive generator’ from L'Ecuyer (1999), each element of which is a feedback multiplicative generator with three integer elements: thus the seed is a (signed) integer vector of length 6. The period is around 2^{191}.
The 6 elements of the seed are internally regarded as 32-bit unsigned integers. Neither the first three nor the last three should be all zero, and they are limited to less than 4294967087 and 4294944443 respectively.
This is not particularly interesting of itself, but provides the basis for the multiple streams used in package parallel.
It exhibits 6 clear failures in each of the TestU01 Crush and the BigCrush suite (L'Ecuyer, 2007).
"user-supplied"
:-
Use a user-supplied generator. See Random.user for details.
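The seed sizes given above for each built-in kind can be checked directly. The following is a minimal sketch, assuming a session in which all built-in kinds are available; the names old_kinds, kinds and seed_len are illustrative only and not part of the documented interface.

```r
## Illustrative sketch: count the state integers stored for each built-in kind.
old_kinds <- RNGkind()                     # remember the kinds currently in use

kinds <- c("Wichmann-Hill", "Marsaglia-Multicarry", "Super-Duper",
           "Mersenne-Twister", "Knuth-TAOCP-2002", "Knuth-TAOCP",
           "L'Ecuyer-CMRG")

seed_len <- vapply(kinds, function(k) {
    set.seed(1, kind = k)                  # switch kind and create a valid seed set
    ## drop the first element, which only codes the kinds in use
    length(get(".Random.seed", envir = globalenv())) - 1L
}, integer(1))

print(seed_len)

## restore the previous kinds (the previous seed itself is not restored)
suppressWarnings(do.call(RNGkind, as.list(old_kinds)))
```

For "Mersenne-Twister" the count is the 624 integers plus the current position, and for the two Knuth generators it is the 101 recorded numbers.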
normal.kind can be "Kinderman-Ramage", "Buggy Kinderman-Ramage" (not for set.seed), "Ahrens-Dieter", "Box-Muller", "Inversion" (the default), or "user-supplied". (For inversion, see the reference in qnorm.) The Kinderman-Ramage generator used in versions prior to 1.7.0 (now called "Buggy") had several approximation errors and should only be used for reproduction of old results. The "Box-Muller" generator is stateful as pairs of normals are generated and returned sequentially. The state is reset whenever it is selected (even if it is the current normal generator) and when kind is changed.
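As a rough illustration of the paragraph above, the sketch below draws normals from the same seed with two different normal generators and re-selects "Box-Muller" to show that its pending state is discarded. The variable names (old_kinds, z_inv, z_bm1, z_bm2) and the seed value are arbitrary choices for this example.

```r
## Illustrative sketch: same seed, different normal generators.
old_kinds <- RNGkind()

set.seed(2024, kind = "Mersenne-Twister", normal.kind = "Inversion")
z_inv <- stats::rnorm(3)

set.seed(2024, kind = "Mersenne-Twister", normal.kind = "Box-Muller")
z_bm1 <- stats::rnorm(3)     # an odd count leaves one normal of a pair pending

set.seed(2024, kind = "Mersenne-Twister", normal.kind = "Box-Muller")
z_bm2 <- stats::rnorm(3)     # re-selecting "Box-Muller" resets that state

identical(z_bm1, z_bm2)      # same generator and seed: same draws
identical(z_inv, z_bm1)      # different normal generators: different draws

## restore the previous kinds
suppressWarnings(do.call(RNGkind, as.list(old_kinds)))
```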
sample.kind can be "Rounding" or "Rejection", or partial matches to these. The former was the default in versions prior to 3.6.0: it made sample noticeably non-uniform on large populations, and should only be used for reproduction of old results. See PR#17494 for a discussion.
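The effect of sample.kind on reproducibility can be seen by drawing the same permutation under both settings. This is a minimal sketch, assuming R 3.6.0 or later; the names old_kinds, s_rej and s_rnd are illustrative, and selecting "Rounding" is expected to produce a warning, which is suppressed here.

```r
## Illustrative sketch: the same seed gives different samples under the two kinds.
old_kinds <- RNGkind()

set.seed(1, kind = "Mersenne-Twister", sample.kind = "Rejection")
s_rej <- sample(10)

## selecting "Rounding" may warn about the non-uniform sampler
s_rnd <- suppressWarnings({
    set.seed(1, kind = "Mersenne-Twister", sample.kind = "Rounding")
    sample(10)
})

print(rbind(Rejection = s_rej, Rounding = s_rnd))

## restore the previous kinds
suppressWarnings(do.call(RNGkind, as.list(old_kinds)))
```

Old results produced before 3.6.0 would need the "Rounding" setting to be reproduced exactly; new work has no reason to select it.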
set.seed uses a single integer argument to set as many seeds as are required. It is intended as a simple way to get quite different seeds by specifying small integer arguments, and also as a way to get valid seed sets for the more complicated methods (especially "Mersenne-Twister" and "Knuth-TAOCP"). There is no guarantee that different values of seed will seed the RNG differently, although any exceptions would be extremely rare. If called with seed = NULL it re-initializes (see ‘Note’) as if no seed had yet been set.
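A short sketch of this behaviour follows, assuming the default generator is in use. The variable names are arbitrary, and the draws after set.seed(NULL) are not expected to be reproducible between runs.

```r
## Illustrative sketch: small integer seeds and re-initialization.
set.seed(1); a <- stats::runif(5)
set.seed(1); b <- stats::runif(5)   # same seed: same stream
set.seed(2); d <- stats::runif(5)   # nearby seed: a quite different stream

identical(a, b)
identical(a, d)

set.seed(NULL)                      # re-initialize as if no seed had been set
e <- stats::runif(5)                # depends on a freshly created seed
```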
The use of kind = NULL, normal.kind = NULL or sample.kind = NULL in RNGkind or set.seed selects the currently-used generator (including that used in the previous session if the workspace has been restored): if no generator has been used it selects "default".
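The sketch below illustrates that leaving kind as NULL keeps whatever generator is current, while "default" can be requested explicitly. The names old_kinds, kind_after_seed and kind_default are illustrative only.

```r
## Illustrative sketch: NULL keeps the current kind, "default" resets it.
old_kinds <- RNGkind()

RNGkind("Wichmann-Hill")
set.seed(10)                        # kind = NULL: Wichmann-Hill stays selected
kind_after_seed <- RNGkind()[1]

RNGkind("default")                  # explicit request for the default kind
kind_default <- RNGkind()[1]

print(c(kind_after_seed, kind_default))

## restore the previous kinds
suppressWarnings(do.call(RNGkind, as.list(old_kinds)))
```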
Value
.Random.seed is an integer vector whose first element codes the kind of RNG and normal generator. The lowest two decimal digits are in 0:(k-1) where k is the number of available RNGs. The hundreds represent the type of normal generator (starting at 0), and the ten thousands represent the type of discrete uniform sampler.
In the underlying C, .Random.seed[-1] is unsigned; therefore in R .Random.seed[-1] can be negative, due to the representation of an unsigned integer by a signed integer.
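The coding of the first element can be unpacked with integer arithmetic. This is a minimal sketch under the assumption that the three kinds are set explicitly as shown; the names code, rng_code, normal_code and sample_code are made up for the example.

```r
## Illustrative sketch: decode the first element of .Random.seed.
old_kinds <- RNGkind()

set.seed(1, kind = "Mersenne-Twister", normal.kind = "Inversion",
         sample.kind = "Rejection")
code <- get(".Random.seed", envir = globalenv())[1]

rng_code    <- code %% 100              # lowest two decimal digits: RNG kind
normal_code <- (code %/% 100) %% 100    # hundreds: normal generator
sample_code <- code %/% 10000           # ten thousands: discrete uniform sampler

print(c(code = code, rng = rng_code, normal = normal_code, sample = sample_code))

## restore the previous kinds
suppressWarnings(do.call(RNGkind, as.list(old_kinds)))
```

All three codes start at 0, so they are offsets into the lists of available kinds rather than 1-based R indices.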
RNGkind returns a three-element character vector of the RNG, normal and sample kinds selected before the call, invisibly if any argument is not NULL. A type starts a session as the default, and is selected either by a call to RNGkind or by setting .Random.seed in the workspace. (NB: prior to R 3.6.0 the first two kinds were returned in a two-element character vector.)
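Because the value is the set of kinds in force before the call, it can be kept and passed back later. The following sketch shows one way to do this; the names old, now and restored are illustrative, and do.call is just one convenient way to supply the three saved kinds positionally.

```r
## Illustrative sketch: save the previous kinds and restore them afterwards.
old <- RNGkind("Super-Duper", "Box-Muller")  # previous kinds, returned invisibly
now <- RNGkind()                             # plain query, changes nothing

print(old)
print(now)

## pass the saved kinds back as kind, normal.kind and sample.kind
suppressWarnings(do.call(RNGkind, as.list(old)))
restored <- RNGkind()
```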
RNGversion returns the same information as RNGkind about the defaults in a specific R version.
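As an illustration, the sketch below switches to the generators of an earlier release and then back. It assumes R 3.6.0 or later; the version string "3.5.0" and the variable names are chosen for the example, and the warning about the "Rounding" sampler is suppressed.

```r
## Illustrative sketch: select the generators of an earlier R version.
old_kinds <- RNGkind()

suppressWarnings(RNGversion("3.5.0"))   # pre-3.6.0 defaults use "Rounding"
kinds_350 <- RNGkind()

RNGversion(getRversion())               # defaults of the running R version
kinds_now <- RNGkind()

print(rbind(kinds_350, kinds_now))

## restore whatever was selected before
suppressWarnings(do.call(RNGkind, as.list(old_kinds)))
```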
set.seed returns NULL, invisibly.
Note
Initially, there is no seed; a new one is created from the current time and the process ID when one is required. Hence different sessions will give different simulation results, by default. However, the seed might be restored from a previous session if a previously saved workspace is restored.
.Random.seed saves the seed set for the uniform random-number generator, at least for the system generators. It does not necessarily save the state of other generators, and in particular does not save the state of the Box–Muller normal generator. If you want to reproduce work later, call set.seed (preferably with explicit values for kind and normal.kind) rather than set .Random.seed.
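The two routes mentioned above can be compared side by side. This is a sketch only, using uniform draws (whose state .Random.seed does capture for the system generators); the seed value and the names saved, x, y and z are arbitrary.

```r
## Illustrative sketch: restoring a saved state versus re-seeding explicitly.
old_kinds <- RNGkind()

set.seed(99, kind = "Mersenne-Twister", normal.kind = "Inversion",
         sample.kind = "Rejection")
saved <- get(".Random.seed", envir = globalenv())   # copy of the uniform RNG state
x <- stats::runif(3)

## route 1: put the saved state back into the workspace
assign(".Random.seed", saved, envir = globalenv())
y <- stats::runif(3)

## route 2 (the recommended one): call set.seed with explicit kinds
set.seed(99, kind = "Mersenne-Twister", normal.kind = "Inversion",
         sample.kind = "Rejection")
z <- stats::runif(3)

identical(x, y)
identical(x, z)

## restore the previous kinds
suppressWarnings(do.call(RNGkind, as.list(old_kinds)))
```

Route 2 records everything needed in the script itself, whereas route 1 depends on a saved object and would miss any state held outside .Random.seed.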
The object .Random.seed is only looked for in the user's workspace.
Do not rely on randomness of low-order bits from RNGs. Most of the supplied uniform generators return 32-bit integer values that are converted to doubles, so they take at most 2^{32} distinct values and long runs will return duplicated values (Wichmann-Hill is the exception, and all give at least 30 varying bits).
Author(s)
of RNGkind: Martin Maechler. Current implementation, B. D. Ripley with modifications by Duncan Murdoch.
References
Ahrens, J. H. and Dieter, U. (1973). Extensions of Forsythe's method for random sampling from the normal distribution. Mathematics of Computation, 27, 927–937.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Wadsworth & Brooks/Cole. (set.seed, storing in .Random.seed.)
Box, G. E. P. and Muller, M. E. (1958). A note on the generation of normal random deviates. Annals of Mathematical Statistics, 29, 610–611. doi:10.1214/aoms/1177706645.
De Matteis, A. and Pagnutti, S. (1993). Long-range Correlation Analysis of the Wichmann-Hill Random Number Generator. Statistics and Computing, 3, 67–70. doi:10.1007/BF00153065.
Kinderman, A. J. and Ramage, J. G. (1976). Computer generation of normal random variables. Journal of the American Statistical Association, 71, 893–896. doi:10.2307/2286857.
Knuth, D. E. (1997). The Art of Computer Programming. Volume 2, third edition. Source code at https://www-cs-faculty.stanford.edu/~knuth/taocp.html.
Knuth, D. E. (2002). The Art of Computer Programming. Volume 2, third edition, ninth printing.
L'Ecuyer, P. (1999). Good parameters and implementations for combined multiple recursive random number generators. Operations Research, 47, 159–164. doi:10.1287/opre.47.1.159.
L'Ecuyer, P. and Simard, R. (2007). TestU01: A C Library for Empirical Testing of Random Number Generators. ACM Transactions on Mathematical Software, 33, Article 22. doi:10.1145/1268776.1268777.
The TestU01 C library is available from https://simul.iro.umontreal.ca/testu01/tu01.html or also https://github.com/umontreal-simul/TestU01-2009.
Marsaglia, G. (1997). A random number generator for C. Discussion paper, posting on Usenet newsgroup sci.stat.math on September 29, 1997.
Marsaglia, G. and Zaman, A. (1994). Some portable very-long-period random number generators. Computers in Physics, 8, 117–121. doi:10.1063/1.168514.
Matsumoto, M. and Nishimura, T. (1998). Mersenne Twister: A 623-dimensionally equidistributed uniform pseudo-random number generator, ACM Transactions on Modeling and Computer Simulation, 8, 3–30.
Source code formerly at http://www.math.keio.ac.jp/~matumoto/emt.html.
Now see http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/c-lang.html.
Reeds, J., Hubert, S. and Abrahams, M. (1982–4). C implementation of SuperDuper, University of California at Berkeley. (Personal communication from Jim Reeds to Ross Ihaka.)
Wichmann, B. A. and Hill, I. D. (1982). Algorithm AS 183: An Efficient and Portable Pseudo-random Number Generator. Applied Statistics, 31, 188–190; Remarks: 34, 198 and 35, 89. doi:10.2307/2347988.
See Also
sample for random sampling with and without replacement.
Distributions for functions for random-variate generation from standard distributions.
Examples
require(stats)
## Seed the current RNG, i.e., set the RNG status
set.seed(42); u1 <- runif(30)
set.seed(42); u2 <- runif(30) # the same because of identical RNG status:
stopifnot(identical(u1, u2))
## the default random seed is 626 integers, so only print a few
runif(1); .Random.seed[1:6]; runif(1); .Random.seed[1:6]
## If there is no seed, a "random" new one is created:
rm(.Random.seed); runif(1); .Random.seed[1:6]
ok <- RNGkind()
RNGkind("Wich") # (partial string matching on 'kind')
## This shows how 'runif(.)' works for Wichmann-Hill,
## using only R functions:
p.WH <- c(30269, 30307, 30323)
a.WH <- c( 171, 172, 170)
next.WHseed <- function(i.seed = .Random.seed[-1])
{ (a.WH * i.seed) %% p.WH }
my.runif1 <- function(i.seed = .Random.seed)
{ ns <- next.WHseed(i.seed[-1]); sum(ns / p.WH) %% 1 }
set.seed(1998-12-04)# (when the next lines were added to the souRce)
rs <- .Random.seed
(WHs <- next.WHseed(rs[-1]))
u <- runif(1)
stopifnot(
next.WHseed(rs[-1]) == .Random.seed[-1],
all.equal(u, my.runif1(rs))
)
## ----
.Random.seed
RNGkind("Super") # matches "Super-Duper"
RNGkind()
.Random.seed # new, corresponding to Super-Duper
## Reset:
RNGkind(ok[1])
RNGversion(getRversion()) # the default version for this R version
## ----
sum(duplicated(runif(1e6))) # around 110 for default generator
## and we would expect almost sure duplicates beyond about
qbirthday(1 - 1e-6, classes = 2e9) # 235,000