Monica, I have a few thoughts.
- (I believe) it is usually better to put confidence intervals on these metrics instead of relying on p-values. The intervals will allow you to make inferential statements and give you a way of characterizing the uncertainty in the estimates. You've seen how to do this with accuracy. For Kappa, there is probably an analytical formula for a CI, but I don't know that it is in R. I would use the bootstrap (via the boot or bootstrap packages) to get intervals for Kappa; see the first sketch below.

- It sounds like some of the models were generated outside of R. I think that the sampling uncertainty can be large. In other words, if you were to do another training/test split, you would get different results, so a CI for accuracy or Kappa on a single test set doesn't really reflect this sampling noise. If you were building the models in R, I would suggest that you do many training/test splits and look at the distributions of those metrics; see the second sketch below.
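For the Kappa interval, here is a minimal sketch of a percentile bootstrap using the boot package. The data frame dat and its columns obs (truth) and pred (predictions) are hypothetical stand-ins for your test set:

library(boot)

## Cohen's kappa from a confusion table; assumes obs and pred are
## factors with the same levels (so the table is square)
kappa_stat <- function(d, idx) {
  tab <- table(d$obs[idx], d$pred[idx])
  n  <- sum(tab)
  po <- sum(diag(tab)) / n                      # observed agreement
  pe <- sum(rowSums(tab) * colSums(tab)) / n^2  # chance agreement
  (po - pe) / (1 - pe)
}

set.seed(123)
boot_out <- boot(dat, kappa_stat, R = 2000)
boot.ci(boot_out, type = "perc")  # percentile interval for Kappa

## For accuracy, an exact binomial interval is built in:
## binom.test(sum(dat$obs == dat$pred), nrow(dat))$conf.int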
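For the resampling idea, a minimal sketch, assuming the models can be refit in R (here a hypothetical rpart tree on dat with outcome y):

library(rpart)

set.seed(456)
accs <- replicate(100, {
  in_train <- sample(nrow(dat), floor(0.75 * nrow(dat)))
  fit  <- rpart(y ~ ., data = dat[in_train, ])
  pred <- predict(fit, dat[-in_train, ], type = "class")
  mean(pred == dat$y[-in_train])  # test-set accuracy for this split
})

summary(accs)  # spread of accuracy across splits
hist(accs)     # visualize the sampling noise

Max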