[R] A question about sampling

Patrick Boily Wed, 02 Feb 2011 14:34:13 -0800

Greetings,

I am trying to do something in R that I think should be doable efficiently, 
but I haven't yet succeeded.


I have a vector of probability weights (for 17 categories), let's call it 
things (it could look like the one below, for instance).

> things
0.026 0 0.233 0 0.131 0 0.415 0 0 0 0 0 0.192 0 0.067 0 0

I'd like a sample of size size.things (say, 47) of the 17 categories (with 
replacement). And I'd like to produce a vector of length 17 which enumerates 
the number of times each category has been selected. This is fairly 
straightforward to do; for instance:

> things2 <- table(factor(sample(1:17, size.things[1], replace=TRUE, prob=things), levels=1:17))
> things2
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
 1  0  9  0  4  0 18  0  0  0  0  0  5  0  4  0  0
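A count vector of this kind is a single multinomial draw, so one possible 
alternative (a sketch, not part of the original post) is base R's rmultinom, 
which returns the per-category counts directly. The names things and 
size.things follow the post; the seed is an arbitrary choice, and the counts 
will differ from the table above. Like sample, rmultinom rescales the weights 
internally, so they do not need to sum to 1.

```r
# Weights for the 17 categories, copied from the example above.
things <- c(0.026, 0, 0.233, 0, 0.131, 0, 0.415, 0, 0,
            0, 0, 0, 0.192, 0, 0.067, 0, 0)
size.things <- 47

set.seed(1)  # arbitrary seed, only for reproducibility

# rmultinom(n, size, prob) returns a length(prob) x n integer matrix;
# with n = 1 the single column holds the count for each category.
counts <- as.vector(rmultinom(1, size = size.things, prob = things))
names(counts) <- seq_along(things)
counts
```

Categories with weight 0 always get a count of 0, and the counts sum to 
size.things.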

What would I need to do if I had a matrix things (50000 x 17) of probability 
weight vectors and a vector of sample sizes size.things (of length 50000), and 
I wanted to sample, all at once, size.things[1] of the 17 categories with 
probability weight vector things[1,], size.things[2] of the 17 categories with 
probability weight vector things[2,], and so on? A loop does the trick, but it 
is slow, and it seems to me that tapply, or something that behaves like 
rowSums, could do this more efficiently. I'm not familiar enough with R to see 
an easy way out. Perhaps there isn't one? Does anybody have an idea?
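One way the row-wise version could be sketched (an assumption, not a tested 
answer to the question) is to draw one multinomial count vector per row with 
rmultinom and collect the results with vapply. This is still one call per 
row, so it is not truly vectorized, and no speed-up over an explicit loop is 
claimed. The matrix below is synthetic stand-in data with 1000 rows instead 
of 50000; n.rows, n.cat, and counts are illustrative names.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n.rows <- 1000  # stand-in for the 50000 rows in the question
n.cat  <- 17

# Synthetic weights: one row of 17 nonnegative weights per sample.
things <- matrix(runif(n.rows * n.cat), nrow = n.rows, ncol = n.cat)
things[, c(2, 4)] <- 0  # some categories with zero weight, as in the example

# Synthetic sample sizes, one per row.
size.things <- sample(20:60, n.rows, replace = TRUE)

# One multinomial draw per row; vapply returns an n.cat x n.rows matrix,
# so transpose to get one row of counts per row of things.
counts <- t(vapply(
  seq_len(n.rows),
  function(i) rmultinom(1, size = size.things[i], prob = things[i, ])[, 1],
  integer(n.cat)
))

dim(counts)  # n.rows x n.cat
```

Each row of counts sums to the matching entry of size.things, which gives a 
quick sanity check: all(rowSums(counts) == size.things).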

Regards,

Patrick









______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
