On Thu, 3 Jun 2010, Roni Kobrosly wrote:
Hello,
I'm using a complex survey dataset and my goal is to simply spit out a bunch of
probability-weighted outcome variable means for the different levels of
covariate. So I first define the structure of the study design (I'm using the
CDC's NHANES data):
dhanes <- svydesign(id=~PSU, strat=~STRATA, weight=~lab_weight, data=final,
nest=TRUE)
No problem there.
Now I use the "svyby" function as follows:
svyby(~outcome, ~covariate, design=dhanes, svymean, na.rm=T) -> haha
print(haha)
covariate outcome se.outcome
1 1 0.4961189 0.08828457
2 2 0.4474706 0.22214557
3 3 0.5157026 0.12076008
4 4 0.6773910 0.20605025
NA NA 0.8728167 0.15622274
...and it works just fine. I get a nice table of the mean and standard error
for each level of the covariate. I started writing a custom function to
automate this and I had problems. Consider this really basic custom function
(that does not seem very different from the above code):
this_is_a_test <-function(outcome, covariate)
{
svyby(~outcome, ~covariate, design=dhanes, svymean, na.rm=T) -> haha
print(hah)
}
You are asking for the mean of a variable called 'outcome', divided up
according to a variable called 'covariate'. Presumably you don't have variables
with either of those names, so R is getting confused.
Formulas don't work the way you want them to. As a simpler example with
nothing to do with the survey package
this_is_a_simpler_example<-function(outcome){
~outcome
}
this_is_a_simpler_example(test)
~outcome
If you want to substitute a variable into a formula, you need to do it
yourself. In your case, you probably want to use make.formula(), from the
survey package
make.formula("test")
~test
make.formula(c("fred","barney","wilma"))
~fred + barney + wilma
Presumably you want to do something like
approach_that_works <-function(outcome, covariate, design=dhanes,...)
svyby(make.formula(outcome), make.formula(covariate), design,...)
some_outcomes <- colnames(dhanes)[47:63]
some_covariates <- colnames(dhanes)[83:95]
lapply( some_outcomes,
function(an_outcome) lapply(some_covariates,
approach_that_works, outcome=an_outcome)
)
For another recent thread using another approach to a related question, see
http://tolstoy.newcastle.edu.au/R/e10/help/10/05/5676.html
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlum...@u.washington.edu University of Washington, Seattle
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.