Faye wrote:
> Our survey is structured as follows: the area to be investigated is divided
> into 6 regions; within each region, one urban community and one rural
> community are randomly selected, and then samples are randomly drawn from
> each selected urban and rural community.
>
> The problem is that in each urban/rural stratum we have only one sample.
> In this case, how do we do the bootstrap?

You are lucky that your sample size is 1. If it were 2, you would probably have proceeded without realizing that the answers were wrong.

Suppose you had a sample size of 2 in each stratum. If you proceeded naturally, drawing bootstrap samples of size 2 from each stratum, you would underestimate the variance by a factor of 2.

In general, the ordinary nonparametric bootstrap estimates of variance are biased downward by a factor of (n-1)/n -- exactly for the mean, approximately for other statistics. In multiple-sample and stratified situations, the bias depends on the stratum sizes.

Three remedies are:
* draw bootstrap samples of size n-1
* "bootknife" sampling - omit one observation (a jackknife sample), then draw a bootstrap sample of size n from that
* bootstrap from a kernel density estimate, with kernel covariance equal to empirical covariance (with divisor n-1) / n.

The latter two are described in
Hesterberg, Tim C. (2004), "Unbiasing the Bootstrap - Bootknife Sampling vs. Smoothing," Proceedings of the Section on Statistics and the Environment, American Statistical Association, 2924-2930.
http://home.comcast.net/~timhesterberg/articles/JSM04-bootknife.pdf

All three are undefined for samples of size 1. You need to go to some other bootstrap, e.g. a parametric bootstrap with variability estimated from other data.

Tim Hesterberg
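
As an illustration of the (n-1)/n bias and of the first two remedies, here is a minimal sketch in Python (standard library only) for a single stratum with n = 2 and the sample mean as the statistic. The function names and the toy data are assumptions made up for this example, not code from the paper or from any package. It compares each scheme's bootstrap variance of the mean with the usual estimate s^2/n (divisor n-1 in s^2).

```python
import random
import statistics


def boot_var_of_mean(sample, draw, reps=20000, seed=1):
    """Monte Carlo variance of the resampled means under the scheme `draw`."""
    rng = random.Random(seed)
    means = [statistics.fmean(draw(sample, rng)) for _ in range(reps)]
    return statistics.pvariance(means)


def ordinary(sample, rng):
    # Ordinary nonparametric bootstrap: n draws with replacement.
    return rng.choices(sample, k=len(sample))


def size_n_minus_1(sample, rng):
    # Remedy 1: draw only n-1 observations with replacement.
    return rng.choices(sample, k=len(sample) - 1)


def bootknife(sample, rng):
    # Remedy 2: omit one observation at random (a jackknife sample),
    # then draw n observations with replacement from the remaining n-1.
    n = len(sample)
    omit = rng.randrange(n)
    kept = sample[:omit] + sample[omit + 1:]
    return rng.choices(kept, k=n)


sample = [2.0, 5.0]                          # toy stratum with n = 2
target = statistics.variance(sample) / len(sample)   # s^2 / n

for name, draw in [("ordinary", ordinary),
                   ("size n-1", size_n_minus_1),
                   ("bootknife", bootknife)]:
    ratio = boot_var_of_mean(sample, draw) / target
    print(f"{name:10s} bootstrap variance / (s^2/n) = {ratio:.3f}")
```

With n = 2 the ordinary bootstrap ratio should come out near (n-1)/n = 1/2, while the other two should come out near 1. If the sample is reduced to a single observation, both remedies have nothing left to resample from and the calls fail, which is the n = 1 problem described above.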