The obvious:
Take a small sample, say 25-50. Get an estimate of your distribution
from that. Then use this to determine how many additional samples
(if any) you need for the desired precision. The latter can
probably be done easily via simulation/bootstrap if you don't want to
specify a paramet
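
A rough sketch of the pilot-then-bootstrap idea in Python, in case it helps. Everything named here is an assumption for illustration: the pilot counts are made up, the target (a 95% interval half-width of 0.5 on the mean) is arbitrary, and `bootstrap_half_width` and `required_sample_size` are hypothetical helpers, not anything from a package. It resamples with replacement from the pilot, so it ignores any finite-population correction.

```python
import random
import statistics


def bootstrap_half_width(pilot, n, reps=2000, level=0.95, rng=None):
    """Half-width of the bootstrap percentile interval for the mean of n draws."""
    rng = rng or random.Random(0)  # fixed seed so repeated calls agree
    # Resample n values with replacement from the pilot, reps times.
    means = sorted(
        statistics.fmean(rng.choices(pilot, k=n)) for _ in range(reps)
    )
    lo = means[int((1 - level) / 2 * reps)]
    hi = means[int((1 + level) / 2 * reps) - 1]
    return (hi - lo) / 2


def required_sample_size(pilot, target_half_width, max_n, step=25):
    """Smallest n on a coarse grid whose simulated half-width meets the target.

    Returns None if no n up to max_n is precise enough.
    """
    for n in range(len(pilot), max_n + 1, step):
        if bootstrap_half_width(pilot, n) <= target_half_width:
            return n
    return None


# Hypothetical pilot sample of 30 counts (invented, skewed, many zeros).
pilot = [0, 0, 1, 0, 2, 0, 0, 5, 1, 0,
         0, 3, 0, 1, 0, 0, 12, 0, 2, 1,
         0, 0, 1, 4, 0, 0, 1, 0, 2, 0]

# Assumed target: 95% interval half-width of at most 0.5 on the mean count.
n_needed = required_sample_size(pilot, target_half_width=0.5, max_n=4392)
print("approximate total sample size needed:", n_needed)
```

The grid step of 25 keeps the search cheap; a finer step or a bisection would give a tighter answer. With a pilot this small, the estimate of the tail is itself noisy, so treat the result as a ballpark.
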
Basically, we have a population of 4,392 documents and we want to find out
the number of patents per document. We don’t want to go through all 4,392
documents, but want a reliable sample size from which to draw inferences. I
feel like this count data will not follow a normal distribution, but more