[Rd] user supplied random number generators

Ross Boylan ross at biostat.ucsf.edu
Thu Jul 30 08:21:13 CEST 2009


?Random.user says (in svn trunk)
  Optionally,
  functions \code{user_unif_nseed} and \code{user_unif_seedloc} can be
  supplied which are called with no arguments and should return pointers
  to the number of seeds and to an integer array of seeds.  Calls to
  \code{GetRNGstate} and \code{PutRNGstate} will then copy this array to
  and from \code{.Random.seed}.
And it offers as an example
  void  user_unif_init(Int32 seed_in) { seed = seed_in; }
  int * user_unif_nseed() { return &nseed; }
  int * user_unif_seedloc() { return (int *) &seed; }

First question: what is the lifetime of the buffers pointed to by the
user_unif_* functions, and who is responsible for cleaning them up?  In
the help file they are static variables, but in general they might be
allocated on the heap or might be in structures that only persist as
long as the generator does.

Since the example uses static variables, it seems reasonable to conclude
the core R code is not going to try to free them.
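To make the lifetime question concrete, here is a minimal sketch of the
heap-allocated case, assuming (as the static example suggests, but the
documentation does not state) that R never frees the pointers it is
handed.  The helpers my_rng_alloc and my_rng_free are hypothetical names,
not part of any R API, and the local Int32 typedef stands in for the one
R provides in R_ext/Random.h.

```c
#include <stdlib.h>

typedef unsigned int Int32;      /* in R this comes from <R_ext/Random.h> */

static Int32 *seed_buf = NULL;   /* owned by this module, assumed never freed by R */
static int    nseed    = 0;      /* number of Int32 words in seed_buf */

/* Hypothetical helper: allocate a zeroed buffer of n seed words,
   replacing any previous buffer.  Returns 0 on success, -1 on failure. */
int my_rng_alloc(int n)
{
    Int32 *p;
    if (n <= 0) return -1;
    p = calloc((size_t) n, sizeof *p);
    if (p == NULL) return -1;
    free(seed_buf);
    seed_buf = p;
    nseed = n;
    return 0;
}

/* Hypothetical helper: release the buffer when the generator goes away.
   After this call R must not be left holding the old address. */
void my_rng_free(void)
{
    free(seed_buf);
    seed_buf = NULL;
    nseed = 0;
}

/* Entry points named in ?Random.user */
int *user_unif_nseed(void)   { return &nseed; }
int *user_unif_seedloc(void) { return (int *) seed_buf; }
```

The point of the sketch is that ownership stays entirely on the user
side: the buffer has to outlive every call to GetRNGstate and
PutRNGstate made while this generator is selected, and whoever
allocated it has to free it.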

Second, are the types really correct?  The documentation seems quite
explicit, all the more so because it uses Int32 in places.  However, the
code in RNG.c (RNG_Init) says

	    ns = *((int *) User_unif_nseed());
	    if (ns < 0 || ns > 625) {
		warning(_("seed length must be in 0...625; ignored"));
		break;
	    }
	    RNG_Table[kind].n_seed = ns;
	    RNG_Table[kind].i_seed = (Int32 *) User_unif_seedloc();
consistent with the earlier definition of RNG_Table entries as
typedef struct {
    RNGtype kind;
    N01type Nkind;
    char *name; /* print name */
    int n_seed; /* length of seed vector */
    Int32 *i_seed;
} RNGTAB;

This suggests that the return type of user_unif_seedloc is really
Int32 *, not int *.  It also suggests that user_unif_nseed should return
the number of 32-bit integers.  The code for PutRNGstate(), for example,
uses them in just that way.

While the dominant model, even on 64-bit hardware, is probably to leave
int as 32 bits, it doesn't seem wise to assume that is always the case.
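One way to avoid relying on that assumption silently is to have the
compiler check it.  The following is only a sketch: it assumes a C11
compiler (for _Static_assert) and repeats the Int32 typedef locally
rather than including R's header.

```c
#include <limits.h>

typedef unsigned int Int32;   /* in R this comes from <R_ext/Random.h> */

/* Refuse to compile where int is not 32 bits wide, since the seed
   array is read through Int32 * but declared through int * in the docs. */
_Static_assert(sizeof(int) * CHAR_BIT == 32,
               "user RNG code assumes a 32-bit int");
_Static_assert(sizeof(Int32) == sizeof(int),
               "Int32 and int must have the same size");

static Int32 seed[2];
static int   nseed = 2;       /* counted in 32-bit words, not bytes */

int *user_unif_nseed(void)   { return &nseed; }
int *user_unif_seedloc(void) { return (int *) seed; }
```

On a platform where the check fails, the build stops instead of
producing a generator whose state is copied with the wrong stride.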

I got into this because I'm trying to extend the rsprng code; sprng
returns its state as a vector of bytes.  Converting these to a vector of
integers depends on the integer length, hence my interest in the exact
definition of integer.  I'm interested in lifetime because I believe
those bytes are associated with the stream and become invalid when the
stream is freed; furthermore, I probably need to copy them into a buffer
that is padded to a full word length.  This means I allocate the buffer
whose address is returned to the core R RNG machinery.  Eventually
somebody needs to free the memory.
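The copy-and-pad step might look roughly like this.  It is a sketch
under assumptions: pack_state is a hypothetical helper, not an sprng or
R function; the state is taken to be a plain byte array of known length;
and the 625-word limit is the one checked in the RNG_Init code quoted
above.

```c
#include <stdlib.h>
#include <string.h>

typedef unsigned int Int32;   /* in R this comes from <R_ext/Random.h> */

/* Hypothetical helper: copy nbytes of generator state into a newly
   allocated array of Int32, zero-padding the last word.  Stores the
   word count in *nwords.  Returns NULL on allocation failure or if the
   state would exceed the 625 words RNG_Init accepts.  The caller owns
   the result and must free() it. */
Int32 *pack_state(const char *bytes, size_t nbytes, int *nwords)
{
    size_t n = (nbytes + sizeof(Int32) - 1) / sizeof(Int32);  /* round up */
    Int32 *buf;

    if (n > 625) return NULL;
    buf = calloc(n ? n : 1, sizeof *buf);   /* calloc zero-fills the padding */
    if (buf == NULL) return NULL;
    if (nbytes > 0) memcpy(buf, bytes, nbytes);
    *nwords = (int) n;
    return buf;
}
```

Because the words are filled with memcpy, the bytes keep their original
order in memory, so the same buffer can be handed back to the generator
unchanged on the same platform; the integer values seen in .Random.seed
will differ between little- and big-endian machines.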

Far more of my rsprng adventures are on
http://wiki.r-project.org/rwiki/doku.php?id=packages:cran:rsprng.  Feel
free to read, correct, or extend it.

Thanks.

Ross Boylan


