Dear R helpers,

I have a question about drawing random numbers from many categorical
distributions.

Consider n individuals, each of which follows a categorical distribution
defined over k categories.
Here is a simple case with n = 4 and k = 3:

catDisMat <- rbind(c(0.1, 0.2, 0.7),
                   c(0.2, 0.2, 0.6),
                   c(0.1, 0.2, 0.7),
                   c(0.1, 0.2, 0.7))

# Draw one category per individual (row) from that row's probabilities.
outVec <- integer(nrow(catDisMat))
for (i in seq_len(nrow(catDisMat))) {
  outVec[i] <- sample.int(ncol(catDisMat), 1, prob = catDisMat[i, ])
}

I can think of one way to potentially speed this up (in reality, my n is very
large, so speed matters). The approach above samples only one value per
call. I could have sampled three values at once for c(0.1,0.2,0.7) because it
appears three times. So, with some manipulation, I think I can implement the
idea "sample(1:3, 3, prob=c(0.1,0.2,0.7), replace = TRUE)" to improve speed
a bit. But I wonder whether there is a better approach for speed?
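
To make the grouping idea concrete, here is a rough, unbenchmarked sketch.
The function name sampleByGroup and the helper variable key are made up for
illustration. Matching rows by pasting their values into a string assumes
that identical distributions are stored as exactly identical numbers.

```r
# Sketch only: sampleByGroup and key are hypothetical names for illustration.
catDisMat <- rbind(c(0.1, 0.2, 0.7),
                   c(0.2, 0.2, 0.6),
                   c(0.1, 0.2, 0.7),
                   c(0.1, 0.2, 0.7))

sampleByGroup <- function(probMat) {
  # Build one string key per row so that identical rows share a key.
  key <- apply(probMat, 1, paste, collapse = "_")
  # Row indices belonging to each distinct distribution.
  groups <- split(seq_len(nrow(probMat)), key)
  out <- integer(nrow(probMat))
  for (idx in groups) {
    # A single call draws all values needed for this distribution.
    out[idx] <- sample.int(ncol(probMat), size = length(idx),
                           replace = TRUE, prob = probMat[idx[1], ])
  }
  out
}

set.seed(1)
outVec <- sampleByGroup(catDisMat)
```

This calls sample.int() once per distinct row, so I would expect it to help
only when the number of distinct distributions is much smaller than n.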

Thanks in advance.

-Sean
