[R] random number generation issues with r and compiled C code

Thomas Lumley tlumley at u.washington.edu
Tue Jan 29 18:06:17 CET 2002


On Tue, 29 Jan 2002 hzi at uol.com.br wrote:

>
> 	One question about "random number generation": AFAIK, true random
> number generation has to receive random input from the "outside." For
> instance, when you generate GnuPG or PGP keys for encryption, you have
> to type something on the keyboard.
> 	With all the other stuff I've seen that purports to generate
> random numbers, real randomness is not being generated, right?
> 	Shouldn't R have some function similar to the above-mentioned one
> that exists in PGP or GnuPG?
>

No, but some other packages should :).  There are three concepts all
called, loosely, `random number generators'.

1) Statisticians are happy with generators where the numbers are uniformly
distributed in one, two,... some reasonable number of dimensions and the
stream has a period much longer than the number of numbers needed.  Being
able to rerun the same sequence of random numbers is also important. These
generators are fairly easy to program and fast, and will only break down
if there is some weird conspiracy between the generator and the statistic
being calculated.
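
As a minimal sketch of the rerun property in point 1, written in Python
rather than R: the helper name `draw` and the seed values are made up for
the example. Python's `random` module uses the Mersenne Twister, a
statistical generator of this first kind.

```python
import random

def draw(seed, n=5):
    # A private generator instance: seeding it fixes the whole stream.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

first = draw(2002)
second = draw(2002)   # same seed, so the same sequence again
other = draw(2003)    # a different seed gives a different sequence

print(first == second)
print(first == other)
```

Rerunning with the same seed reproduces a simulation exactly, which is
what makes results checkable by someone else.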

2) Cryptographers deal with the situation where there *is* a conspiracy to
make the generator and the statistic interact badly, so they aren't
satisfied with our random number generators. They need generators where
it is infeasible to guess anything about future `random' numbers from
observing past ones. These are of two subtypes:

2a) Deterministic, cryptographically strong generators: these use an
encryption function (a cipher like triple DES or a secure hash) to update
the random numbers.  Given some of the random numbers you shouldn't be
able to guess others without knowing the key.  These tend to be slower
than statistical random number generators and have no real advantage in
cases where your statistic isn't out to get you.

2b) Deterministic generators are very sensitive to someone finding the
key, so encryption software often uses a generator that mixes in
information that should be unpredictable to outsiders (e.g. timings of
keypresses, disk latencies). This allows recovery if someone finds out or
guesses the key.  As these generators are not deterministic they are less
useful for statisticians -- they can't be rerun. In addition, the amount
of genuine randomness available may not be very large. Since R is
frequently used to produce millions of random numbers per hour (and
sometimes much more) there just isn't enough entropy in these sources even
if we wanted to use it.
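
A rough sketch of the mixing idea in 2b, again in Python and under stated
assumptions: the class name `ReseedingGenerator` is hypothetical, SHA-256
stands in for whatever mixing function real software uses, and
`os.urandom` plus a timer reading stand in for keypress timings and disk
latencies. It is not a secure design.

```python
import hashlib
import os
import time

class ReseedingGenerator:
    """Illustrative only: a hash-based state that can absorb outside events."""

    def __init__(self, key: bytes):
        self._state = hashlib.sha256(key).digest()

    def mix_in(self, event: bytes) -> None:
        # Fold outside information into the state.
        self._state = hashlib.sha256(self._state + event).digest()

    def next_block(self) -> bytes:
        # Derive the output and the next state from the current state.
        out = hashlib.sha256(b"out" + self._state).digest()
        self._state = hashlib.sha256(b"next" + self._state).digest()
        return out

victim = ReseedingGenerator(b"leaked key")
attacker = ReseedingGenerator(b"leaked key")   # someone who learned the key

# With only the key involved, the attacker tracks the stream exactly.
same_before = victim.next_block() == attacker.next_block()

# Mix in information the attacker cannot observe.
victim.mix_in(os.urandom(16))
victim.mix_in(time.perf_counter_ns().to_bytes(8, "big"))
same_after = victim.next_block() == attacker.next_block()

print(same_before, same_after)
```

After the mixing step, knowing the key no longer lets the attacker follow
the stream. For the same reason nobody can rerun it, including the
person who generated it.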


So, R doesn't provide cryptographically secure random number generators
because we're not security experts and that isn't what R is for. Other
statistical packages don't either.  Don't use R to write an email
encryption program.


	-thomas





