Dear all,

My question may be more about statistics than about R, although it concerns how to apply the 'quantreg' package. Please accept my apologies if you believe I am misusing this list.
To be very brief, the problem is that I have data on only a random draw of doctors' patients, not all of them. I am interested in, say, the median number of patients per doctor. Does it suffice to use the "nid" option in summary.rq?

More specifically, if the model generating the number of patients, say r_i, of doctor i is r_i = const + u_i, then I think I would obtain the median number of patients per doctor using rq(r~1, ...) and passing the result to summary.rq() with the option se="iid".

Unfortunately, I don't observe r_i in the data. Instead, the data contain only a fraction p of these r_i patients: each patient is included in the data with (known) probability p. Thus, for each doctor i, the number of patients IN THE DATA follows a binomial distribution with parameters r_i and p. For each i I have s_i patients in the data, where s_i is a draw from this binomial distribution. That is, the problem with the data is that I observe s_i rather than r_i.

Simple Monte Carlo experiments confirm my intuition that standard errors should be larger when using the "noisy" information s_i/p instead of the (unobserved) r_i.

My guess is that I can consistently estimate any quantile of the number of patients per doctor AND ITS STANDARD ERROR using quantreg's rq command, rq(I(s/p)~1, ...), together with the summary.rq() command with option se="nid". Am I correct?

I am grateful for any help on this issue.

Best regards,
Thorsten Vogel
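A minimal sketch of the kind of Monte Carlo experiment described above, using base R only. The Poisson distribution for r_i, its mean of 100, the sampling probability p = 0.2, and the sample and replication sizes are all assumptions chosen for illustration; none of them comes from the actual data. The sketch compares the spread of the sample median across replications when it is computed from the unobserved r_i and from the noisy proxy s_i/p.

```r
set.seed(1)

n_doctors <- 200   # doctors per simulated data set (assumed)
p         <- 0.2   # known inclusion probability (assumed value)
n_reps    <- 500   # number of Monte Carlo replications (assumed)

one_rep <- function() {
  # True patient counts; the Poisson(100) choice is purely illustrative
  r <- rpois(n_doctors, lambda = 100)
  # Observed counts: each patient is included with probability p
  s <- rbinom(n_doctors, size = r, prob = p)
  # Sample medians of the true counts and of the rescaled observed counts
  c(true = median(r), noisy = median(s / p))
}

# 2 x n_reps matrix with rows "true" and "noisy"
sims <- replicate(n_reps, one_rep())

# Monte Carlo standard deviation of each median estimate
mc_se <- apply(sims, 1, sd)
print(mc_se)
```

The two entries of mc_se can then be set against the standard errors that summary.rq() reports for rq(r~1) and rq(I(s/p)~1) under se="iid" and se="nid", to see which option tracks the simulated variability.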