In "Introduction to Probability Models" (sec. 11.4.1, 8th edition), Sheldon Ross describes the alias method for such weighted sampling. It is based on a decomposition of the original distribution (the weights) into a mixture of two-point distributions. I don't know the run-time complexity of the decomposition step, but once the decomposition is found, the cost of generating a sample is linear in the sample size.
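For what it's worth, one well-known linear-time construction of the alias tables is Vose's variant: O(n) setup, then O(1) per draw. The sketch below is in Python rather than R, is not necessarily the exact decomposition Ross presents, and all names are mine:

```python
import random

def alias_setup(weights):
    """Build alias tables from positive weights (Vose's method, O(n) setup).

    Scales the weights so they average 1, then repeatedly pairs an
    underfull bucket with an overfull one to form two-point mixtures.
    """
    n = len(weights)
    total = float(sum(weights))
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n   # prob[i]: chance of keeping i given bucket i was picked
    alias = [0] * n    # alias[i]: the other outcome stored in bucket i
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        # the large item donates mass to top up bucket s
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for i in large + small:  # leftovers (floating-point rounding) are full buckets
        prob[i] = 1.0
    return prob, alias

def alias_draw(prob, alias):
    """Draw one index in O(1): pick a bucket uniformly, then flip a biased coin."""
    i = random.randrange(len(prob))
    return i if random.random() < prob[i] else alias[i]
```

The total mass assigned to index i works out to (prob[i] + sum of 1 - prob[j] over buckets j aliased to i) / n, which reproduces the original weight fractions exactly.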
I haven't coded this because I had found an easier way to deal with the original problem I had at that time.

HTH,
Vadim

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Bo Peng
> Sent: Tuesday, June 21, 2005 9:24 AM
> To: r-devel@r-project.org
> Subject: [Rd] efficiency of sample() with prob.
>
> Dear list,
>
> A while ago, Vadim asked for opinions on improving the efficiency of
> sample() with prob, i.e., sampling with replacement with weights.
> ( https://stat.ethz.ch/pipermail/r-devel/2004-September/030844.html )
> He did not post what he ended up doing about this problem, though.
>
> I am having exactly the same problem. I need to sample with
> replacement from a population of size 10 million with fitness
> values for each individual. sample() is too slow for this purpose.
>
> I implemented a bisection search algorithm. It is about 30%
> faster than the linear search one, but still not good enough
> for me. (I can post the function if needed.) Does anybody have
> some good ideas? The only other idea I have is using a faster
> (but worse) random number generator just for this application.
>
> Thanks.
> Bo
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
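For reference, the bisection approach Bo describes amounts to a binary search over the cumulative weights: O(n) to build the cumulative sums once, then O(log n) per draw. A minimal sketch in Python (names are mine, not Bo's code):

```python
import bisect
import itertools
import random

def weighted_sample(weights, k):
    """Sample k indices with replacement, proportional to weights.

    Builds the cumulative sums once, then each draw maps a uniform
    variate into the cumulative table with a binary search.
    """
    cum = list(itertools.accumulate(weights))
    total = cum[-1]
    # bisect_right finds the first index whose cumulative weight exceeds u
    return [bisect.bisect_right(cum, random.random() * total)
            for _ in range(k)]
```

This is the baseline the alias method beats: O(1) per draw instead of O(log n), at the price of the one-time table construction.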