[Rd] simulate in stats

Thu Sep 15 18:07:48 CEST 2005

BTW, I think there is a problem with the way the argument "seed" is used 
in the new simulate in stats.  The problem is that repeated calls to 
simulate using the default argument will introduce a new pattern into 
the RNG:

 > stats:::simulate
function (object, nsim = 1, seed = as.integer(runif(1, 0, 
.Machine$integer.max)),   ...)
UseMethod("simulate")
<environment: namespace:stats>

 > stats:::simulate.lm
function (object, nsim = 1, seed = as.integer(runif(1, 0, 
.Machine$integer.max)),    ...)
{
    if (!exists(".Random.seed", envir = .GlobalEnv))
        runif(1)
    RNGstate <- .Random.seed
    set.seed(seed)
  ...

This should not be done, as the resulting RNG has not been studied or 
proven. A better mechanism is  to have a default argument equal NULL, 
and not touch the seed in that case. There are several examples of this 
in the package dse1 (in bundle dse),  see for example simulate.ARMA and 
simulate.SS. They also use the utilities in the setRNG package to save 
more of the information necessary to reproduce simulations. Roughly it 
is done like this:

simulate.x <- function (model, rng = NULL,  ...)
  {if (is.null(rng)) rng <- setRNG() #returns the RNG setting to be 
saved with the result
    else {
        old.rng <- setRNG(rng)
        on.exit(setRNG(old.rng))
        }
   ...

The seed by itself is not very useful if the purpose is to be able to 
reproduce things, and I think it would be a good idea to incorporate the 
few small functions setRNG into stats (especially if the simulate 
mechanism is being introduced).

The argument "nsim" presumably alleviates to some extent the above 
concern about changing the RNG pattern. However, in my fairly extensive 
experience it is not very workable to produce all the simulations and 
then do the analysis of them. In a Monte Carlo experiment the generated 
data set is just too big. A better approach is to do the analysis and 
save only necessary information after each simulation. That is the 
approach, for example, in dse2:::EstEval.

Paul

Paul Gilbert wrote:

> Can the arguments nsim and seed be passed as part of ... in the new 
> simulate generic in R-2.2.0alpha package stats?
>
> This would potentially allow me to use the stats generic rather than 
> the one I define in dse. There are contexts where nsim and seed do not 
> make sense. I realize that the default arguments could be ignored, but 
> it does not really make sense to introduce a new generic with that in 
> mind. (I would also prefer that the "object" argument was called 
> "model" but this is less important.)
>
> Paul Gilbert