Greetings,
I am trying to do something in R that I think should be doable efficiently,
but I haven't succeeded yet.
I have a vector of probability weights (for 17 categories), let's call it
things (it could look like the one below, for instance).
> things
0.026 0 0.233 0 0.131 0 0.415 0 0 0 0 0 0.192 0 0.067 0 0
I'd like a sample of size size.things (say, 47) of the 17 categories (with
replacement). And I'd like to produce a vector of length 17 which enumerates
the number of times each category has been selected. This is fairly
straightforward to do; for instance:
> things2 <- table(factor(sample(1:17, size.things[1], replace = TRUE, prob = things), levels = 1:17))
> things2
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
 1  0  9  0  4  0 18  0  0  0  0  0  5  0  4  0  0
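For comparison, the same kind of count vector can be drawn in one call with
rmultinom() from the stats package, which returns the counts directly rather
than going through sample(), factor() and table(). This is only a sketch: the
seed is arbitrary, and counts is a name made up for this illustration.
rmultinom() rescales the weights internally, so they need not sum exactly to 1.

```r
# Weights from the example above; rmultinom() normalizes them internally.
things <- c(0.026, 0, 0.233, 0, 0.131, 0, 0.415, 0, 0, 0, 0, 0,
            0.192, 0, 0.067, 0, 0)
size.things <- 47

set.seed(1)  # arbitrary seed, only so the draw is reproducible

# One multinomial draw: a 17 x 1 integer matrix, flattened to a plain vector.
counts <- as.vector(rmultinom(1, size = size.things, prob = things))
names(counts) <- seq_along(things)
counts
```

The counts always add up to size.things, and categories with weight 0 are never
selected.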
Now suppose things is a 50000 x 17 matrix whose rows are probability weight
vectors, and size.things is a vector of 50000 sample sizes. How could I sample
size.things[1] of the 17 categories using the weights in things[1,],
size.things[2] of the 17 categories using the weights in things[2,], and so
on, all at once? A loop does the trick, but it takes a while, and it seems to
me that tapply, or something that behaves like rowSums, could do this more
efficiently. I'm not familiar enough with R to see an easy way out. Perhaps
there isn't one? Does anybody have an idea?
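One possible direction, sketched under assumptions: draw each row's counts with
rmultinom() and collect them with vapply(). The matrix and sample sizes below
are simulated stand-ins (1000 rows rather than 50000, arbitrary seed), and
counts is a hypothetical name. vapply() is still a loop over rows internally,
so this mainly removes the sample()/factor()/table() overhead per row; whether
it is fast enough at 50000 rows would need to be timed.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
n <- 1000                          # stand-in for the 50000 rows in the question
k <- 17

# Simulated weights: one weight vector per row, with some categories zeroed
# out as in the example vector. Every row keeps positive weights in the odd
# columns, so no row is all zero.
things <- matrix(runif(n * k), nrow = n, ncol = k)
things[, seq(2, k, by = 2)] <- 0

# Simulated sample sizes, one per row.
size.things <- sample(20:60, n, replace = TRUE)

# Row i of counts is one multinomial draw of size size.things[i] with
# weights things[i, ]. vapply() returns a k x n matrix, hence the t().
counts <- t(vapply(
  seq_len(n),
  function(i) rmultinom(1, size = size.things[i], prob = things[i, ])[, 1],
  numeric(k)
))

dim(counts)
```

Each row of counts sums to the corresponding entry of size.things, which makes
for a quick sanity check: all(rowSums(counts) == size.things).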
Regards,
Patrick