Faye wrote:
>Our survey is structured as follows: the area to be investigated is
>divided into 6 regions; within each region, one urban community and one
>rural community are randomly selected, then samples are randomly drawn
>from each selected urban and rural community.
>
>The problem is that in each urban/rural stratum, we only have one sample.
>In this case, how do we bootstrap?

You are lucky that your sample size is 1.  If it were 2 you would
probably have proceeded without realizing that the answers were wrong.

Suppose you had two samples in each stratum.  If you proceeded naturally,
drawing bootstrap samples of size 2 from each stratum, you would
underestimate the variance by a factor of 2.

In general the ordinary nonparametric bootstrap estimates of variance
are biased downward by a factor of (n-1)/n -- exactly for the mean,
approximately for other statistics.  In multiple-sample and stratified
situations, the bias depends on the stratum sizes.
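
To see the (n-1)/n factor for the mean without any simulation noise, here
is a minimal sketch in Python (standard library only) that enumerates every
possible bootstrap sample from a tiny stratum.  The data values and the
function name are made up for illustration.

```python
from itertools import product
from statistics import mean, pvariance, variance

def exact_bootstrap_var_of_mean(x):
    """Variance of the bootstrap distribution of the mean,
    computed by enumerating all n**n equally likely resamples."""
    n = len(x)
    means = [mean(resample) for resample in product(x, repeat=n)]
    return pvariance(means)

x = [3.0, 7.0]                # hypothetical stratum with n = 2
n = len(x)
unbiased = variance(x) / n    # usual estimate s^2 / n (divisor n - 1 in s^2)
boot = exact_bootstrap_var_of_mean(x)
ratio = boot / unbiased       # equals (n - 1) / n for the mean

print(unbiased, boot, ratio)
```

With n = 2 the ratio is 1/2, which is the factor of 2 mentioned above.
Enumeration is only practical for very small n; for larger samples you
would draw random resamples instead.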

Three remedies are:
* draw bootstrap samples of size n-1
* "bootknife" sampling - omit one observation (a jackknife sample), then
  draw a bootstrap sample of size n from that
* bootstrap from a kernel density estimate, with kernel covariance equal
  to empirical covariance (with divisor n-1) / n.
The latter two are described in 
Hesterberg, Tim C. (2004), Unbiasing the Bootstrap-Bootknife Sampling vs. 
Smoothing, Proceedings of the Section on Statistics and the Environment, 
American Statistical Association, 2924-2930.
http://home.comcast.net/~timhesterberg/articles/JSM04-bootknife.pdf
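
A rough sketch of the three remedies for a single sample, in Python
(standard library only).  The function names and data are hypothetical,
and the Gaussian kernel in the smoothed version is an assumption made
for the sketch; the description above only fixes the kernel covariance.
For the mean, each sampler should give a bootstrap variance close to
s^2/n, up to Monte Carlo error.

```python
import random
from statistics import mean, pvariance, stdev, variance

def resample_n_minus_1(x, rng):
    """Ordinary resampling with replacement, but of size n - 1."""
    return [rng.choice(x) for _ in range(len(x) - 1)]

def bootknife(x, rng):
    """Omit one observation at random, then draw n with replacement
    from the remaining n - 1 observations."""
    n = len(x)
    omit = rng.randrange(n)
    kept = x[:omit] + x[omit + 1:]
    return [rng.choice(kept) for _ in range(n)]

def smoothed(x, rng):
    """Resample, then add kernel noise with variance s^2 / n
    (Gaussian kernel assumed here for illustration)."""
    n = len(x)
    h = stdev(x) / n ** 0.5
    return [rng.choice(x) + rng.gauss(0.0, h) for _ in range(n)]

def boot_var_of_mean(x, sampler, reps=10000, seed=1):
    """Monte Carlo variance of the mean under the given sampler."""
    rng = random.Random(seed)
    return pvariance([mean(sampler(x, rng)) for _ in range(reps)])

x = [2.1, 3.4, 1.9, 5.0, 4.2]          # hypothetical data
target = variance(x) / len(x)          # s^2 / n
estimates = {f.__name__: boot_var_of_mean(x, f)
             for f in (resample_n_minus_1, bootknife, smoothed)}

print(target, estimates)
```

In a stratified survey you would apply the chosen sampler within each
stratum separately and combine the results for each replicate.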

All three are undefined for samples of size 1.  You need to go to some
other bootstrap, e.g. a parametric bootstrap with variability estimated
from other data.
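
As one possible shape for such a parametric bootstrap, here is a sketch in
Python (standard library only).  Everything specific in it is an
assumption: the normal model, the observed values, the within-stratum
standard deviation sigma (which would have to come from other data), and
the choice of the unweighted mean across strata as the statistic.

```python
import random
from statistics import mean, pstdev

# Hypothetical: one observed value per stratum.
observed = [12.0, 15.5, 9.8, 14.1, 11.3, 13.7]
# Hypothetical: within-stratum standard deviation taken from other data.
sigma = 2.0

def parametric_bootstrap_se(observed, sigma, reps=10000, seed=1):
    """Standard error of the mean across strata, simulating each
    stratum value from Normal(observed value, sigma)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        simulated = [rng.gauss(y, sigma) for y in observed]
        stats.append(mean(simulated))
    return pstdev(stats)

se = parametric_bootstrap_se(observed, sigma)
print(se)
```

The answer is only as good as the borrowed sigma, so it is worth checking
how sensitive the result is to that value.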

Tim Hesterberg
