Re: [R] svymeans question

Thomas Lumley Thu, 28 Aug 2008 11:07:31 -0700

Other people have explained that the issue is missing data. I just
wanted to note that the reason for using only the complete cases on all
variables is that svymean() computes the covariance matrix of all the
means, and this can't really be done sensibly when the means are based
on different subsets.
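
To make the point concrete, here is a minimal sketch. It assumes the
apiclus1 example data shipped with the survey package stand in for the
poster's lower_dat, and the missing values are introduced artificially;
the object names des, joint, solo and vars are made up for illustration.
It compares one joint call with one call per variable, the latter being
one possible way to get each mean from all non-missing rows of that
variable.

```r
library(survey)
data(api)  # example data shipped with the survey package, not the poster's data

d <- apiclus1
d$api00[1:10] <- NA  # make one variable missing on a few rows (illustrative)
des <- svydesign(ids = ~dnum, weights = ~pw, data = d)

# Joint call: rows with an NA in *any* listed variable are dropped first,
# so the mean of api99 is computed on the reduced set of rows.
joint <- svymean(~api00 + api99, des, na.rm = TRUE)

# One call per variable: each mean uses all non-missing rows of that variable.
vars <- c("api00", "api99")
solo <- sapply(vars, function(v) {
  coef(svymean(as.formula(paste("~", v)), des, na.rm = TRUE))[[1]]
})

print(coef(joint))  # api99 here differs from the per-variable value
print(solo)
```

The per-variable calls give up the joint covariance matrix of the means,
which is the trade-off described above; if only the means and their
individual standard errors are needed, that may be acceptable.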


        -thomas


On Tue, 26 Aug 2008, Doran, Harold wrote:

I have the following code, which produces the output below it:

clus1 <- svydesign(ids = ~schid, data = lower_dat)
items <-  as.formula(paste(" ~ ", paste(lset, collapse= "+")))
rr1 <- svymean(items, clus1, deff='replace', na.rm=TRUE)

rr1

           mean       SE   DEff
W525209 0.719748 0.015606 2.4932
W525223 0.508228 0.027570 6.2802
W525035 0.827202 0.014060 2.8561
W525131 0.805421 0.015425 3.1350
W525033 0.242982 0.020074 4.5239
W525163 0.904647 0.013905 4.6289
W525165 0.439981 0.020029 3.3620
W525167 0.148112 0.013047 2.7860
W525177 0.865924 0.014977 3.9898
W525179 0.409003 0.020956 3.7515
W525181 0.634076 0.022076 4.3372
W525183 0.242498 0.019073 4.0894
W525401 0.262343 0.021830 3.4354
W525059 0.854792 0.016551 4.5576
W525251 0.691191 0.025010 6.0512
W525083 0.433204 0.017310 2.5200
W525289 0.634560 0.012762 1.4504
W524763 0.791868 0.014478 2.6265
W524765 0.223621 0.019627 4.5818
W524951 0.242982 0.016796 3.1669
W524769 0.820910 0.016786 3.9579
W524771 0.872701 0.015853 4.6712
W524839 0.518877 0.026433 5.7794
W525374 1.209584 0.043065 5.1572
W524885 0.585673 0.027780 6.5674
W525377 1.100678 0.050093 5.8851
W524787 0.839303 0.012994 2.5852
W524789 0.339787 0.019230 3.4041
W524791 0.847047 0.012885 2.6461
W524825 0.500968 0.021988 3.9935
W524795 0.868345 0.014951 4.0377
W524895 0.864472 0.013872 3.3917
W524897 0.804937 0.020070 5.2977
W524967 0.475799 0.032137 8.5511
W525009 0.681994 0.018670 3.3188

However, when I do the following:

svymean(~W524787, clus1, deff='replace', na.rm=TRUE)
           mean       SE   DEff
W524787 0.855547 0.011365 4.1158

Compare this to the row for W524787 in the first output (ninth from the
bottom): the means differ.

Computing the mean of the item by itself with svymean agrees with the
sample mean:

mean(lower_dat$W524787, na.rm=TRUE)

[1] 0.8555471

Now, I know that there is a covariance between the variables, but I was
under the impression that the point estimate would still equal the
sample mean, and that accounting for the sample design affects only the
standard error.

In the work I am doing, it is important for the means of the items from
svymean to be the same as the sample mean when it is computed by
itself. It's a bit of a story as to why, and I can provide that info if
relevant.

I don't see an argument in svydesign or in svymean that would allow for
me to treat the variables as being independent. But, maybe I am missing
something else and would welcome any reactions.

Harold

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Thomas Lumley                   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]       University of Washington, Seattle

