R-alpha: set.seed(.) [was 'compatibility']

Martin Maechler Martin Maechler <maechler@stat.math.ethz.ch>
Mon, 15 Sep 1997 15:48:09 +0200


>>>>> Thomas Lumley writes:
    >> Two compatibility issues found while trying to convert a simulation
    >> from S to R.

    >> 1. set.seed()  We don't have this function.  According to Venables &
    >> Ripley it just picks a seed from a list of 1000 possibilities. How about
    >>
    >> "set.seed" <-function (i) 
    >> {
    >>         if (any(i > 1000) | any(i < 1)) 
    >>                 stop("argument to set.seed() must be in 1:1000")
    >>         if (any(i != trunc(i))) 
    >>                 stop("argument to set.seed()must be an integer")
    >>         .Random.seed <<- as.integer((200 * c(i, i^2, i^3))%%c(30269, 
    >>                 30307, 30323))
    >> }

Thank you Thomas,
yes I agree that we should have a  set.seed(.).

{{now follows a longish explanation about what S-plus does, etc..
  at the end a better (?) proposal for set.seed 
}}

When I first thought about implementing  set.seed(.) in R,
I tried to understand what S-plus does (and did not succeed).
In the mean time, I have re-investigated and found almost ``everything'' :

S-plus 3.4 *help* says

Sp3.4>>     DESCRIPTION:
Sp3.4>> 	   Puts the random number generator in a reproducible state.
Sp3.4>> 
Sp3.4>>     USAGE:
Sp3.4>> 	   set.seed(i)
Sp3.4>> 
Sp3.4>>     REQUIRED ARGUMENTS:
Sp3.4>>     i:     be an integer between 0 and 1000.
Sp3.4>> 
Sp3.4>>     SIDE EFFECTS:
Sp3.4>> 	   Sets the value of the .Random.seed object in  the  working
Sp3.4>> 	   directory.
Sp3.4>> 
Sp3.4>>     DETAILS:
Sp3.4>> 	   Random number generators in S-PLUS are all  based  upon  a
Sp3.4>> 	   single  uniform  random  number  generator  that generates
Sp3.4>> 	   numbers in a very long, but ultimately periodic  sequence.
Sp3.4>> 	   The   position   in   the   sequence  is  held  in  object
Sp3.4>> 	   .Random.seed.  Function set.seed sets .Random.seed so that
Sp3.4>> 	   subsequent  calls  to  random  number  generator functions
Sp3.4>> 	   (runif, rnorm, etc.)  will generate  numbers  from  a  new
Sp3.4>> 	   portion of the overall cycle.
Sp3.4>> 
Sp3.4>> 	   Random  number  generation  in  S-PLUS  is  adapted   from
Sp3.4>> 	   Marsaglia  (1973);  see  Kennedy  and  Gentle  (1980)  for
Sp3.4>> 	   background information.

So they __say__   ``i in 0:1000'', but the
problem is that set.seed(i) also ``works'' with any other integer i.

But what exactly should happen?
The crucial sentence is (as vague as)
 Sp3.4>> Function set.seed sets .Random.seed so that
 Sp3.4>> subsequent calls to random number generator functions (runif...) 
 Sp3.4>> will generate numbers from a new portion of the overall cycle.

The intent being that for ANY i,j (i!=j) set.seed(i) and set.seed(j) would
choose ``distant'' parts of the periodic cycle.

Unfortunately, we don't know what exactly happens in
the S-plus code for set.seed(i), namely,  .C("setseed", as.integer(i))
However, I have found out the following which should suffice :

 1) Only .Random.seed[4:6]  are varyied by set.seed, i.e., 
   after set.seed(i), .Random.seed always is

	   1  2  3  4  5  6  7  8  9 10 11 12	
      [1] 21 14 49  *  *  * 32 22 36 23 28  3
		   X4 X5 X6

   Which means that only the 'congrval' {Venables & Ripley, 2nd ed., p.166-167}
   is varied	

 2) X4 is always ==  16*((i+2) %% 4)
    X5 takes (all) values in 0:63  [also with some periodicity]
    X6 takes the values of 0:3

 3) set.seed(i) <==> set.seed(i + n*1024)  for all integers n, i

 4) { set.seed(i) ;  i \in {integers}} 
    results in a set of 1024 different values of 'congrval'.
    These values are EXACTLY EQUISPACED lying in {0,..,2^32}.

    If C[1:1024] are these sorted values, we have

	C[i] == 201621 + (i-1)* 2^22  , for i in 1:1024
	


So far for S-plus.

-----------------------------------------------------------------

Now, in R with a different basic RNG (with a longer period), 
		-- see ?.Random.seed --
we would want  set.seed(i)
to also guarantee that the locations in the periodic cycle are far apart.

I think (but may be wrong; I must do other things now !!)
this can be better achieved by just changing one of the 3 congruential
generators involved.
This leads to the following proposal :

set.seed <- function (i) 
{
        if(length(i) != 1  || i != (ii <- as.integer(i)))
	  stop("argument to set.seed() must be integer(1)")
	i <- ii
        if (i > 1024 || i < 1) {
	  warning("set.seed(i) has i outside 1:1024; taking  i %% 1024")
	  i <- i %% 1024; if(i <= 0) i <- i+ 1024
	}
        invisible(.Random.seed <<- as.integer(c((171*i)%%30269, 10909, 10111)))
}



Martin Maechler <maechler@stat.math.ethz.ch>		 <><
Seminar fuer Statistik, SOL G1
ETH (Federal Inst. Technology)	8092 Zurich	 SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1086
http://www.stat.math.ethz.ch/~maechler/
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-