[R] Efficient sampling from a discrete distribution in R
Issac Trotts
issac.trotts at gmail.com
Tue Sep 4 05:48:00 CEST 2007
Hello r-help,
As far as I've seen, there is no function in R dedicated to sampling
from a discrete distribution with a specified mass function. The
standard library doesn't come with anything called rdiscrete or rpmf,
and I can't find any such thing on the cheat sheet or in the
Probability Distributions chapter of _An Introduction to R_. Googling
also didn't bring back anything. So, here's my first attempt at a
solution. I'm hoping someone here knows of a more efficient way.
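# Sample from a discrete distribution with given probability mass function.
# Returns `size` indices in 1..length(pmf); assumes pmf sums to 1.
rdiscrete = function(size, pmf) {
  stopifnot(length(pmf) > 1)
  cmf = cumsum(pmf)  # cumulative mass function
  # Inverse CMF: first index whose cumulative mass exceeds p
  icmf = function(p) {
    min(which(p < cmf))
  }
  ps = runif(size)  # uniform draws on (0, 1)
  sapply(ps, icmf)
}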
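The per-draw sapply loop in rdiscrete above can probably be avoided. A
minimal sketch, assuming pmf is non-negative with a positive sum: base R's
findInterval() does the same inverse-CMF lookup for all draws at once. The
name rdiscrete2 is hypothetical and used only for illustration.

```r
# Vectorised variant of the inverse-CMF lookup (rdiscrete2 is a made-up name).
rdiscrete2 <- function(size, pmf) {
  stopifnot(length(pmf) > 1, all(pmf >= 0), sum(pmf) > 0)
  cmf <- cumsum(pmf)
  cmf <- cmf / cmf[length(cmf)]  # normalise so the last entry is exactly 1
  ps <- runif(size)
  # findInterval(p, cmf) counts the entries of cmf that are <= p,
  # so adding 1 gives the first index i with p < cmf[i].
  findInterval(ps, cmf) + 1L
}

xs <- rdiscrete2(10000, c(0.2, 0.3, 0.5))
table(xs) / length(xs)  # empirical frequencies, roughly 0.2, 0.3, 0.5
```

Since findInterval() uses a binary search over cmf, this should scale better
than calling which() once per draw, though I have not benchmarked it.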
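# Check that about half of N draws from c(0.5, 0.5) equal 1
test.rdiscrete = function(N = 10000) {
  # Tolerance of 12 standard errors: the sd of the proportion is 0.5 / sqrt(N)
  err.tol = 6.0 / sqrt(N)
  xs = rdiscrete(N, c(0.5, 0.5))
  err = abs(sum(xs == 1) / N - 0.5)
  stopifnot(err < err.tol)
  list(e = err, xs = xs)
}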
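One candidate for the more efficient route is base R's general-purpose
sample(), whose prob argument takes a vector of weights. The sketch below
assumes that sampling indices with replacement is what is wanted. The pmf
values and the 10000 draws are arbitrary illustration choices.

```r
# Draw indices 1..length(pmf) with the given probabilities using base R.
set.seed(1)  # only so the illustration is reproducible
pmf <- c(0.2, 0.3, 0.5)
xs <- sample(seq_along(pmf), size = 10000, replace = TRUE, prob = pmf)

# Empirical frequency of each index, for comparison against pmf
freq <- tabulate(xs, nbins = length(pmf)) / length(xs)
print(rbind(pmf = pmf, freq = freq))
```

With replace = TRUE the draws are independent, so this should be
interchangeable with rdiscrete(size, pmf) for checks like test.rdiscrete.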
Thanks,
Issac