# [R] help with pseudo-random numbers

Faheem Mitha faheem at email.unc.edu
Mon Aug 12 05:16:10 CEST 2002

```Dear People,

I have a vexing problem related to pseudo-random number generation, and
would appreciate any help and advice. This problem is not directly related
to R, and the only reason I am posting it to this list is that my
implementation is using R. Let me describe my problem by giving an
example, that is close to what I am trying to do.

Suppose we are given a stream of pseudo-random numbers, U_1, U_2...
intended to simulate a sequence of iid rvs from Unif[0,1], and a
probability distribution (specified as a pdf f or cdf F), and are required
to obtain a sequence of rvs X_1, X_2... such that X_i is a deterministic
function of U_i.  Let us further suppose that F^{-1} is difficult to
calculate, so one does not simply do X_i = F^{-1}(U_i).

Now, a possible alternative would be use U_i to set a random seed for a
sequence of pseudo-random numbers, V_{i1}, V_{i2},... Then we can use this
V_{ij} stream for every i to simulate F using the basic rejection method.
(Here we further assume that there is some available pdf g we can simulate
from such that f <= Mg for some const M).

So we have a picture that looks like

U_1 (random number seed for): V_{1,1} V_{1,2} V_{1,3}...
U_2 (random number seed for): V_{2,1} V_{2,2} V_{2,3}...
...

However, this causes a difficulty, which is that the sequences V_{ij} for
different i will (probably) not be independent. As far as I known,
pseudo-random numbers are not meant to be used in this fashion. However,
assuming we're stuck with this situation, the question is how to mitigate
this problem so that it becomes less severe. Two possibilities come to my
mind.

1) Use two different random number generators to generate the U's and
generate the V's. If no other measures are taken, then using the same
random number generator for both seems unwise, since then (for example)
V_{1,1} and U_2 would be the same.

2) Discard part of the V stream for each i. Ie. for each i, we discard
V_{i,1}, V_{i,2},... V_{i,N} for some fixed N, and only use the numbers
from V_{i,N+1} onwards. This measure might lessen the risk of correlation,
though I don't know enough to say whether this would actually be
effective.

1) and 2), and if you think they are reasonable, tell me

1') What two random number generators should I use? I believe the R
default is "Marsaglia-Multicarry".

2') What should I take N to be? Clearly, I'd like to make it as small as
possible.

Any other suggestions you think would be better would also be gratefully

Now, I have another question relating to how I could actually implement
the above scenario. The context in which I am working is mixed R/C++ code,
using compiled C++ code loaded into R as a shared library. Let us suppose
I am calling a C++ function main() from R. Now, when I call unif_rand() in
the C++ code, I assume I am using the random seed that was passed down
somehow from R. (I'd like to know how this is done if anyone can tell me).
Repeated calls to unif_rand() presumably correspond therefore to the U_i
above. Now, I'd like suggestions for the cleanest/easiest way to switch to
generating the sequence V_{ij} after setting the seed as U_i and then
continue generating the U's when I have done generating a sequence of
V_{ij}. There does not seem to be any very automatic way of handling such
a setup using already existing mechanisms.