# [R] Simulation (multinomial deviate generation)

ben@zoo.ufl.edu ben at zoo.ufl.edu
Wed Dec 6 17:46:32 CET 2000

```  I've collected various implementations of multinomial deviate generation
in S/R.  There's an "rmult" package by Jonathan Rougier on STATLIB's S
archive (http://lib.stat.cmu.edu/S/rmult), which runs unmodified under R,
but is much slower than your code #1.

The following code (which I didn't write but have tweaked -- it could be
improved further; unfortunately I don't have the author's name any more)
is much faster *if* you can generate all of your deviates at the same
time, but twice as slow if you do them one at a time.

## (rmult.gb1 and rmult.gb2 are just your code)

tot <- 100000
system.time(for (i in 1:tot) rmult.gb1(5,10))
## [1] 21.27  1.02 22.37  0.00  0.00
system.time(for (i in 1:tot) rmult.gb2(5,10))
## [1] 137.78   3.59 142.22   0.00   0.00
system.time(for (i in 1:tot) rmulti(1,rep(1/5,5),10))
## [1] 50.07  1.05 51.36  0.00  0.00
system.time(rmulti(tot,rep(1/5,5),10))  ## vectorize
## [1] 0.17 0.02 0.19 0.00 0.00

rmulti<-function (n, size, prob)
## pick multinomial deviates
{
if (size==0) stop("rmulti: size==0")
if (is.matrix(prob)) {
l <- ncol(prob)
totp <- prob %*% rep(1, l)
}
else {
l <- length(prob)
totp <- sum(prob)
prob <- matrix(prob, nrow = 1)
}
siz <- rep(size, length.out = n)
x <- matrix(numeric(n * l), ncol = l)
for (i in 1:(l - 1)) {
## inserted min() to deal with roundoff problems that led to p>1.0
x[, i] <- rbinom(n, siz, min(prob[, i]/totp,1.0))
siz <- siz - x[, i]
totp <- totp - prob[, i]
totp[totp <= 0] <- 1
}
x[,l]<-siz
x
}

On Wed, 6 Dec 2000, gb wrote:

>
> I want to draw a random sample Y_1, ..., Y_n, iid Poisson,
> but condition on their sum being equal to  k. Two
> suggestions:
>
> 1)   lambda <- k / n ## optimal value
>      ysum <- -1
>      while (ysum != k){
>         y <- rpois(n, lambda)
>         ysum <- sum(y)
>      }
>      y
>
> 2) Aiming at Multinom(k, 1/n, ..., 1/n):
>
>    y <- as.vector(table(factor(sample(n, k, replace = T), levels = 1:n)))
>
> 1) seems to be faster than 2), but my question is whether there is
> something even better?  'sample' is fast enough, but how can I do
> the rest (factor, table) smarter in 2)?
>
> Thanks for any suggestions!
>
> Göran
>

--
318 Carr Hall                                bolker at zoo.ufl.edu
Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
Box 118525                                   (ph)  352-392-5697
Gainesville, FL 32611-8525                   (fax) 352-392-3704

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

```