Dear R helpers, I have a question about drawing random numbers from many categorical distributions.
Consider n individuals, each of which follows its own categorical distribution over k categories. Here is a simple case with n = 4 and k = 3:

    catDisMat <- rbind(c(0.1, 0.2, 0.7),
                       c(0.2, 0.2, 0.6),
                       c(0.1, 0.2, 0.7),
                       c(0.1, 0.2, 0.7))
    # one draw per row, using that row's probabilities
    outVec <- rep(NA, nrow(catDisMat))
    for (i in 1:nrow(catDisMat)) {
      outVec[i] <- sample(1:3, 1, prob = catDisMat[i, ], replace = TRUE)
    }

In reality my n is very large, so speed matters. The loop above draws only one value per iteration. I can think of one way to speed it up: c(0.1, 0.2, 0.7) appears three times, so I could draw all three values for those rows in a single call, sample(1:3, 3, prob = c(0.1, 0.2, 0.7), replace = TRUE), after some manipulation to group identical rows. But I wonder whether there is a better approach for speed?

Thanks in advance.

-Sean
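
One possible way to avoid the per-row loop is inverse-CDF sampling, vectorized over the rows. This is only a sketch and not something taken from the post: the function name sampleCatRows and its argument probMat are made up for illustration, and it assumes every row of the matrix is non-negative and sums to 1. It draws one uniform number per row and counts how many of that row's cumulative probabilities the number exceeds, looping over the k - 1 columns instead of the n rows.

```r
# Hypothetical helper (not from the original post): draw one category per row
# of probMat. Assumes each row of probMat is non-negative and sums to 1.
sampleCatRows <- function(probMat) {
  n <- nrow(probMat)
  k <- ncol(probMat)
  u <- runif(n)          # one uniform draw per individual
  out <- rep.int(1L, n)  # every row starts in category 1
  cum <- numeric(n)      # running cumulative probability for each row
  # Only the first k - 1 thresholds are needed; stopping there also keeps
  # rounding error in the last cumulative sum from producing category k + 1.
  for (j in seq_len(k - 1L)) {
    cum <- cum + probMat[, j]
    out <- out + (u > cum)  # move to the next category where u passes the threshold
  }
  out
}

set.seed(1)
catDisMat <- rbind(c(0.1, 0.2, 0.7),
                   c(0.2, 0.2, 0.6),
                   c(0.1, 0.2, 0.7),
                   c(0.1, 0.2, 0.7))
outVec <- sampleCatRows(catDisMat)
```

The loop now runs k - 1 times rather than n times, which should help when n is large and k is small. Grouping identical rows and calling sample() once per group, as suggested in the question, would also work, but it only pays off when many rows share the same probabilities.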