Dear Thomas,


Thanks a lot for your reply, which helps a great deal.



*From:* Thomas Lumley-2 [via R] [mailto:
[email protected]]
*Sent:* 25 September 2012 22:49
*To:* Kristof
*Subject:* Re: Three Stage Sampling of categorical variable using 'survey'
in R



On Wed, Sep 26, 2012 at 12:45 AM, Kristof <[hidden email]> wrote:

> 1) SURVEY DESIGN
> So far I have mainly designed two-stage cluster surveys and have never done
> a three-stage cluster survey design. It seems that the analysis takes only
> the PSU and the enumeration area into account, so whatever happens at the
> second stage seems irrelevant to the analysis, which seems odd to me.

There are two issues here that aren't the same.

If you don't provide population size information the analysis depends
only on the PSU, strata, weights, and measurements.

*[KB=>] The population is 38.8 million (we can get an exact figure and make a
fairly accurate projection based on the population growth rate). The sample at
this time is around 7800, so the sampling fraction is around 5%.*

If the sample is
much smaller than the population, then even if you do provide
population size information the analysis essentially depends only on
the PSU, strata, weights, and measurements.

*[KB=>] I assume 5% is small enough that no finite population correction (FPC)
is required.*
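
As a rough check on the "sample much smaller than the population" condition, here is a minimal Python sketch of the sampling fraction and of the finite population correction factor sqrt(1 - n/N) from simple random sampling without replacement. The function names are illustrative, not from any package. Taking the two figures quoted above at face value gives a fraction of roughly 0.02%, well under the 5% mentioned, although the population figure may count people while the sample counts households, so the units may not match.

```python
import math

def sampling_fraction(n, N):
    """Fraction of the population that ends up in the sample."""
    return n / N

def fpc_factor(n, N):
    """Multiplier on a standard error under simple random sampling
    without replacement: sqrt(1 - n/N)."""
    return math.sqrt(1.0 - n / N)

# Figures quoted in the thread; treating them as the same kind of unit
# is an assumption made only for this illustration.
population = 38_800_000
sample = 7_800

f = sampling_fraction(sample, population)
print(f"sampling fraction: {f:.4%}")
print(f"FPC factor at this fraction: {fpc_factor(sample, population):.6f}")
print(f"FPC factor at a 5% fraction: {fpc_factor(5, 100):.4f}")
```

Even at a 5% fraction the factor stays above 0.97, so omitting the correction makes standard errors only slightly conservative.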



This doesn't mean that the design doesn't matter after stage 1, it
just means that the weights and the distribution of the measurements
tells you everything about the subsequent stages that you need to
know.  In particular, the variability in weight*data is important, and
different designs can give very different standard errors.

*[KB=>] Thanks, that is clear: the PSU information is used to determine the
design effect (deff), and all stages below are accounted for in the sample
weight.*
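
To illustrate why PSU, weights, and measurements can be enough, here is a minimal Python sketch of one common estimator with that property: the with-replacement ("ultimate cluster") variance estimator for a weighted total in a single stratum. It is not taken from the thread, and whether a given package computes exactly this is an assumption. The records are hypothetical toy data.

```python
from collections import defaultdict

def total_and_se(records):
    """Estimate a population total and its standard error.

    records: iterable of (psu_id, weight, y) tuples, one per sampled unit.
    Only PSU membership, the weights, and the measurements are used;
    nothing about the later sampling stages enters explicitly.
    """
    psu_totals = defaultdict(float)
    for psu, w, y in records:
        psu_totals[psu] += w * y          # weighted total within each PSU

    z = list(psu_totals.values())
    n = len(z)
    if n < 2:
        raise ValueError("need at least two PSUs to estimate a variance")

    total = sum(z)
    zbar = total / n
    variance = n / (n - 1) * sum((zi - zbar) ** 2 for zi in z)
    return total, variance ** 0.5

# Hypothetical toy data: three PSUs, two sampled units each.
records = [
    ("A", 10, 1), ("A", 10, 0),
    ("B", 12, 1), ("B", 12, 1),
    ("C", 8, 0), ("C", 8, 1),
]
total, se = total_and_se(records)
print(total, se)
```

The spread of the weighted PSU totals drives the standard error, which is why designs with the same PSUs but different later stages can still give different precision.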


The same design principles apply at later stages of design as at stage
1:  stratifying on a variable correlated with the variable of interest
will increase precision, and the Neyman allocation formula still tells
you how to choose stratum sizes based on what you know about variance
and cost.

*[KB=>] That is clear*
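
For reference, a minimal Python sketch of the Neyman allocation mentioned above: stratum sample sizes proportional to N_h * S_h, or to N_h * S_h / sqrt(c_h) when per-unit costs differ. The stratum sizes, standard deviations, and costs below are hypothetical.

```python
def neyman_allocation(n, sizes, sds, costs=None):
    """Split a total sample size n across strata.

    sizes: stratum population sizes N_h
    sds:   stratum standard deviations S_h of the variable of interest
    costs: optional per-unit costs c_h (equal costs if omitted)
    Returns unrounded allocations; rounding is left to the caller.
    """
    if costs is None:
        costs = [1.0] * len(sizes)
    scores = [N * S / c ** 0.5 for N, S, c in zip(sizes, sds, costs)]
    total = sum(scores)
    return [n * s / total for s in scores]

# Hypothetical strata: the smaller stratum is more variable.
alloc = neyman_allocation(100, sizes=[1000, 3000], sds=[2.0, 1.0])
print(alloc)

# Same strata, but sampling in the second stratum costs four times as much.
alloc_cost = neyman_allocation(100, [1000, 3000], [2.0, 1.0], costs=[1.0, 4.0])
print(alloc_cost)
```

The more variable stratum receives a larger share than its population size alone would give, and a higher cost shifts sample away from a stratum.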

  It's harder to optimize a multistage design because there
are many more options and which design is best will depend on a lot of
things you don't know, but it's not intrinsically different from
optimising a single-stage design.

*[KB=>] *Now that I see how the other stages are taken into account in the
analysis, I can see that



> Our intention was to use PPS at the first and second stages and to have
> same-size takes in each enumeration area.
> The design would be to select 50 out of 150 Upazilas (sub-districts) as
> PSUs using probability proportional to size (PPS).
> The second stage would be 6 village-groups out of an average of 250
> village-groups per Upazila, using PPS.
> The third stage would use SRS to select 26 households in each of the 6
> selected village-groups per Upazila. Total sample size: 7800.
> The household is the BSU, and where we need to calculate information at the
> individual level we are confident we can correct the sample weights
> for that.
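
The design described above can be sketched in Python as a household inclusion probability and design weight. This assumes the size measure is the household count at both PPS stages and that PPS selection gives each unit an inclusion probability of (number selected) x (its size) / (total size). The household counts are hypothetical. Under these assumptions the three stages multiply out to 7800 / (total households), so the design is self-weighting.

```python
def household_inclusion_prob(upazila_size, total_size, group_size,
                             n_psu=50, n_groups=6, n_households=26):
    """Inclusion probability of one household under the three-stage design.

    upazila_size: households in the household's Upazila (size measure)
    total_size:   households in all 150 Upazilas combined
    group_size:   households in the household's village-group
    """
    p1 = n_psu * upazila_size / total_size     # stage 1: PPS over Upazilas
    p2 = n_groups * group_size / upazila_size  # stage 2: PPS within Upazila
    p3 = n_households / group_size             # stage 3: SRS of households
    for p in (p1, p2, p3):
        if not 0 < p <= 1:
            raise ValueError("stage probability outside (0, 1]; "
                             "unit too large or too small for this design")
    return p1 * p2 * p3

# Hypothetical counts, for illustration only.
total_households = 7_500_000
pi = household_inclusion_prob(upazila_size=50_000,
                              total_size=total_households,
                              group_size=200)
weight = 1.0 / pi
print(pi, weight)
```

If the size measures used at selection differ from the actual household counts found in the field, the weights are no longer constant and must be computed per household from the three stage probabilities.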


That sounds plausible
*[KB=>] *Reassuring to hear, and thanks for the demystification. The
unknown sometimes seems more daunting than it is. I will discuss this with the
team, but I'm sure they are grateful for your help.

   -thomas

*[KB=>] *Kristof
-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
View this message in context: 
http://r.789695.n4.nabble.com/Three-Stage-Sampling-of-categorical-variable-using-survey-in-R-tp4644110p4644200.html
Sent from the R help mailing list archive at Nabble.com.