On Tue, 27 Oct 2009 12:11:42 -0700 (PDT) Ben Bolker <bol...@ufl.edu> wrote:

> This is not quite right because we have estimated the
> rate from the data -- from ?ks.test
> ...
>
> But perhaps not a bad start.

Actually, it is a very bad start. Using estimated parameters in tests like ks.test gives you a *completely* wrong distribution of the test statistic and of the resulting p-value. Here's a simple example:

  library(MASS)   # for truehist()
  n=20
  r=1
  # p-value when the true rate is supplied
  f=function(n,r) { x=rexp(n,rate=r); ks.test(x,"pexp",rate=r)$p.value }
  # p-value when the rate is estimated from the same sample
  g=function(n,r) { x=rexp(n,rate=r); ks.test(x,"pexp",rate=1/mean(x))$p.value }
  truehist(replicate(1000, f(n,r)), h=.1, col="wheat")
  truehist(replicate(1000, g(n,r)), h=.1, col="wheat")

The first histogram is roughly uniform, as p-values under the null hypothesis should be; the second is heavily skewed towards large p-values.

Note that increasing the number of observations n does *not* help. Also note that under the null hypothesis, the parameter estimation mostly affects the size of the test rather than its power; i.e., it *reduces* the probability of a type I error, and very much so. I'm not sure what the effect is under the alternative, but I know that several papers have been written on this topic.

-- 
Karl Ove Hufthammer
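
P.S. To put a number on the example above, here is a rough sketch (not a definitive recipe) that estimates the actual rejection rate at a nominal 5% level for both versions of the test, and then shows one possible remedy: simulating the null distribution of the statistic with the rate re-estimated in every simulated sample. The names alpha, nsim, rej_known, rej_est and ks_exp_mc are my own, chosen for illustration, and the seed is arbitrary. The remedy relies on the exponential being a scale family, so the null distribution of the statistic does not depend on the true rate and simulating with rate 1 is enough.

```r
set.seed(1)   # arbitrary seed, only so the run is reproducible
n <- 20; r <- 1; nsim <- 1000; alpha <- 0.05

# Same two functions as in the example: true rate vs. estimated rate
f <- function(n, r) { x <- rexp(n, rate = r); ks.test(x, "pexp", rate = r)$p.value }
g <- function(n, r) { x <- rexp(n, rate = r); ks.test(x, "pexp", rate = 1 / mean(x))$p.value }

# Empirical type I error rates at the nominal level alpha
rej_known <- mean(replicate(nsim, f(n, r)) < alpha)  # should be close to alpha
rej_est   <- mean(replicate(nsim, g(n, r)) < alpha)  # expected to be far below alpha

# Hypothetical helper: Monte Carlo p-value for "x is exponential with
# unknown rate". The rate is re-estimated in each simulated sample, so the
# reference distribution matches what was done to the observed data.
ks_exp_mc <- function(x, nsim = 1000) {
  stat <- function(y) unname(ks.test(y, "pexp", rate = 1 / mean(y))$statistic)
  d_obs <- stat(x)
  d_sim <- replicate(nsim, stat(rexp(length(x), rate = 1)))
  (1 + sum(d_sim >= d_obs)) / (nsim + 1)
}

p_mc <- ks_exp_mc(rexp(n, rate = r))
```

Comparing rej_known with rej_est shows how conservative the naive test is; p_mc is a p-value from the simulated reference distribution for one fresh sample.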