Hi all,

How can we use ks.test() to evaluate the goodness of fit of a mixture of distributions?
For example, I have the following dataset:

> x
  [1] 176.1 176.8 259.6 171.6  90.0 234.3 145.7 113.7 105.9 176.2 168.9 136.1
 [13] 109.2 110.3 164.3 117.7 131.3 163.7 200.4 196.4 196.2 168.6 190.4 127.5
 [25] 136.0 114.2 112.0  91.9 333.4 295.5 172.0 293.3  91.7 289.7 118.8  55.1
 [37] 161.9 233.9 197.7 118.4 139.1 189.8  45.6 167.8  53.1  86.2 148.4  80.3
 [49] 105.8 160.6 217.1 134.4 103.1 221.6 163.8 171.5 195.1 201.5 145.3  97.4
 [61] 287.9 352.6 173.6  85.7 182.0 166.4 175.4 224.4 167.2 143.7 168.9 205.3
 [73] 192.8 203.7 195.5 193.6 201.2 280.9 159.8 115.4 113.5 216.5 140.0 164.6
 [85] 341.3 301.8 146.1 182.6 263.5 318.0 168.5 205.2 204.7 213.0 250.2 265.9
 [97] 215.4 344.3 191.1 175.1 188.9 206.0 127.5 148.0 172.7 193.1 150.7 195.5
[109] 142.4 232.3  92.7 127.6 227.0 358.5 202.4 224.6 374.6 220.8 173.8 120.6
[121] 265.0 151.1

Having fitted a gamma mixture, I obtain the following parameters:

> param_of_2comp
         comp.1    comp.2
alpha  5.911165 95.662958
beta  30.455212  1.927715
> lmbd
[1] 0.8058989 0.1941011
> ks <- ks.test(w,"gamma_pdf",lmbd,alpha,beta,noc)
> print(ks$statistic[["D"]])
0.80599
> print(ks$p.value)
0

The problem is that, across many datasets, ks.test() always gives a p-value of 0 and a very high D when I test mixtures with multiple components. Moreover, in the plot I have, the two components fit the data well:

http://docs.google.com/View?docid=dcvdrfrh_3cm63hcfn
(Gamma is at the bottom.)

What is wrong with my approach above? This is the gamma pdf function I used:
__BEGIN__
gamma_pdf <- function(x,lambda, alpha,beta, k){
  temp<-NULL
  al = alpha[1:k]
  be = beta[1:k]
  for(j in 1:k){
    # each being gamma distribution
    temp=cbind(temp,pgamma(x,shape=al[j],scale=be[j]))
  }
  temp=t(lambda*t(temp))
  as.vector(temp)
}
__END__

- Gundala Viswanath
Jakarta - Indonesia
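
One possible explanation, offered as a sketch rather than a confirmed diagnosis: ks.test() expects its second argument to be a cumulative distribution function that returns one value per observation. The gamma_pdf function above stacks the lambda-weighted component CDFs into a vector of length k * length(x) instead of summing them, so ks.test() would not be comparing the data against the mixture CDF. The example below sums the weighted components instead. The name gamma_mix_cdf and the simulated data x_sim are assumptions made for illustration; only the parameter values are taken from the post.

```r
# Hypothetical helper (not from the original post): mixture CDF that returns
# one value per observation by summing the weighted component CDFs.
gamma_mix_cdf <- function(x, lambda, alpha, beta, k) {
  cdf <- numeric(length(x))
  for (j in seq_len(k)) {
    cdf <- cdf + lambda[j] * pgamma(x, shape = alpha[j], scale = beta[j])
  }
  cdf
}

# Parameter values as reported in the post
lmbd  <- c(0.8058989, 0.1941011)
alpha <- c(5.911165, 95.662958)
beta  <- c(30.455212, 1.927715)

# Simulated stand-in data (an assumption) so the example runs on its own;
# replace x_sim with the real observations.
set.seed(1)
n     <- 122
comp  <- sample(1:2, n, replace = TRUE, prob = lmbd)
x_sim <- rgamma(n, shape = alpha[comp], scale = beta[comp])

# Extra named arguments are passed through to gamma_mix_cdf
ks <- ks.test(x_sim, gamma_mix_cdf,
              lambda = lmbd, alpha = alpha, beta = beta, k = 2)
print(ks$statistic[["D"]])
print(ks$p.value)
```

Note that the list element is named statistic (singular). Also, when the mixture parameters are estimated from the same data that is being tested, the p-value reported by ks.test() is only approximate and tends to be conservative, so it is probably best read as a rough indication of fit.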