Just to tie up this thread, I wanted to report my solution: When (n-1)p is an integer, there is a closed form solution: pbinom(j-1,n,...)
When it is not an integer, its fairly easy to approximate the solution by interpolating between the closed-form solutions: fitting log(1 - probability from closed form solution) on an orthogonal polynomial in n. This is a _very_ fast and fairly accurate solution. Thanks to all who offered their help... On Thu, Feb 17, 2011 at 11:11 PM, Bentley Coffey <bentleygcof...@gmail.com>wrote: > > Duncan, > > I'm not sure how I missed your message. Sorry. What you describe is what I > do when (n-1)p is an integer so that R computes the sample quantile using a > single order statistic. My later posting has that exact binomial expression > in there as a special case. When (n-1)p is not an integer then R uses a > weighted average of 2 order statistics, in which case I'm left with my > standing problem... > > > On Mon, Feb 14, 2011 at 2:26 PM, Duncan Murdoch > <murdoch.dun...@gmail.com>wrote: > >> On 14/02/2011 9:58 AM, Bentley Coffey wrote: >> >>> I need to calculate the probability that a sample quantile will exceed a >>> threshold given the size of the iid sample and the parameters describing >>> the >>> distribution of each observation (normal, in my case). I can compute the >>> probability with brute force simulation: simulate a size N sample, apply >>> R's >>> quantile() function on it, compare it to the threshold, replicate this >>> MANY >>> times, and count the number of times the sample quantile exceeded the >>> threshold (dividing by the total number of replications yields the >>> probability of interest). The problem is that the number of replications >>> required to get sufficient precision (3 digits say) is so HUGE that this >>> takes FOREVER. I have to perform this task so much in my script >>> (searching >>> over the sample size and repeated for several different distribution >>> parameters) that it takes too many hours to run. >>> >>> I've searched for pre-existing code to do this in R and haven't found >>> anything. Perhaps I'm missing something. Is anyone aware of an R function >>> to >>> compute this probability? >>> >>> I've tried writing my own code using the fact that R's quantile() >>> function >>> is a linear combination of 2 order statistics. Basically, I wrote down >>> the >>> mathematical form of the joint pdf for the 2 order statistics (a function >>> of >>> the sample size and the distribution parameters) then performed a >>> pseudo-Monte Carlo integration (i.e. using Halton Draws rather than R's >>> random draws) over the region where the sample quantile exceeds the >>> threshold. In theory, this should work and it takes about 1000 times >>> fewer >>> clock cycles to compute than the Brute Force approach. My problem is that >>> there is a significant discrepancy between the results using Brute Force >>> and >>> using this more efficient approach that I have coded up. I believe that >>> the >>> problem is numerical error but it could be some programming bug; >>> regardless, >>> I have been unable to locate the source of this problem and have spent >>> over >>> 20 hours trying to identify it this weekend. Please, somebody help!!! >>> >>> So, again, my question: is there code in R for quickly evaluating the CDF >>> of >>> a Sample Quantile given the sample size and the parameters governing the >>> distribution of each iid point in the sample? >>> >> >> I think the answer to your question is no, but I think it's the wrong >> question. Suppose Xm:n is the mth sample quantile in a sample of size n, >> and you want to calculate P(Xm:n > x). Let X be a draw from the underlying >> distribution, and suppose P(X > x) = p. Then the event Xm:n > x >> is the same as the event that out of n independent draws of X, at least >> n-m+1 are bigger than x: a binomial probability. And R can calculate >> binomial probabilities using pbinom(). >> >> Duncan Murdoch >> >> > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.