Hi again,

Looking more into test statistics, I realized that maybe I can use power.prop.test to see whether the difference between the 2 accuracies is zero or not. Do you have any comments about that? Also, should I consider the kappa statistic a kind of proportion as well and use the same test?

If this does not violate any important assumption, then:

power.prop.test(n = 146, p1 = 0.7877, p2 = 0.8014, strict = TRUE)

     Two-sample comparison of proportions power calculation

              n = 146
             p1 = 0.7877
             p2 = 0.8014
      sig.level = 0.05
          power = 0.0596356
    alternative = two.sided

NOTE: n is number in *each* group

which seems to say that the two accuracies are barely different, since the reported value is 0.06 > 0.05 (note that the output labels this value as power, not as a p-value).

For the kappa statistics it will be:

power.prop.test(n = 146, p1 = 0.3675, p2 = 0.4315, strict = TRUE)

     Two-sample comparison of proportions power calculation

              n = 146
             p1 = 0.3675
             p2 = 0.4315
      sig.level = 0.05
          power = 0.1999816
    alternative = two.sided

NOTE: n is number in *each* group

Any comments are really appreciated,

Monica

----------------------------------------
> From: pisican...@hotmail.com
> To: r-help@r-project.org
> CC: max.k...@pfizer.com
> Subject: [R] statistical significance of accuracy increase in classification
> Date: Tue, 24 Feb 2009 16:22:41 +0000
>
> Hi everyone,
>
> I would like to test the statistical significance (for what it's worth ...)
> of an increase in classification accuracy and kappa statistics between
> different land classifications. The classifications were done using other
> software (like eCognition and See5), but the results were "sampled" at
> locations where I know the "reference" class. Using the package "caret" I
> built the confusion matrices. For now I am interested in the overall results,
> which give the overall classification accuracy and kappa statistics, among
> others. Depending on which classification I test, I get a small increase in
> accuracy and a slightly larger increase in kappa. I wonder if there is a way
> to do a statistical significance test for the accuracy and kappa increase
> between the 2 classifications.
>
> Data example and some code:
>
> library(caret)
>
> # Reference ("ground truth") class at each of the 146 sample locations
> ref <- c(15, 13, 13, 13, 13, 15, 14, 14, 14, 15, 13, 13, 13, 15, 13, 13, 13,
> 15, 13, 13, 13, 13, 13, 13, 13,13, 14, 13, 13, 13, 13, 13, 13, 13, 15, 13,
> 13, 15, 13, 15, 13, 13, 15, 13, 13, 13, 13, 13, 13, 13,13, 13, 13, 13, 13,
> 15, 13, 13, 13, 13, 13, 13, 15, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
> 13,13, 14, 13, 13, 13, 13, 13, 14, 14, 15, 15, 13, 13, 13, 13, 13, 15, 13,
> 13, 13, 13, 13, 13, 13, 13,13, 13, 14, 13, 13, 13, 13, 13, 13, 13, 13, 13,
> 13, 13, 13, 13, 13, 15, 13, 13, 13, 13, 13, 13, 13,13, 13, 13, 13, 13, 13,
> 13, 14, 13, 13, 13, 13, 13, 13, 15, 13, 13, 13, 13, 13, 13)
>
> # Classes predicted by the first classification
> class1 <- c(14, 14, 13, 13, 13, 15, 13, 14, 15, 14, 14, 13, 14, 13, 13, 13,
> 13, 13, 13, 13, 13, 13, 13, 14, 13,13, 13, 13, 13, 13, 13, 13, 13, 13, 15,
> 13, 14, 13, 13, 14, 13, 13, 15, 13, 13, 13, 13, 13, 13, 13,13, 13, 15, 21,
> 13, 15, 13, 21, 13, 13, 14, 13, 15, 13, 15, 13, 13, 14, 13, 13, 13, 13, 13,
> 13, 13,13, 14, 14, 13, 13, 13, 13, 15, 15, 15, 15, 13, 13, 13, 13, 13, 5, 13,
> 15, 13, 13, 13, 13, 13, 13,15, 13, 15, 14, 13, 13, 13, 13, 13, 13, 13, 13,
> 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,13, 13, 13, 13, 13, 13,
> 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13)
>
> # Classes predicted by the second classification
> class2 <- c(14, 15, 13, 13, 13, 15, 13, 14, 15, 15, 14, 13, 14, 13, 13, 13,
> 13, 13, 13, 13, 13, 13, 13, 14, 13,13, 13, 13, 13, 13, 13, 13, 13, 13, 15,
> 13, 14, 13, 13, 15, 13, 13, 15, 14, 13, 13, 13, 13, 13, 13,13, 13, 15, 13,
> 13, 15, 13, 21, 13, 13, 13, 13, 15, 13, 15, 15, 13, 14, 13, 13, 13, 13, 13,
> 13, 15,13, 14, 14, 13, 13, 13, 13, 15, 14, 15, 15, 13, 14, 13, 13, 13, 15,
> 13, 15, 13, 13, 13, 13, 13, 13,15, 13, 15, 14, 13, 13, 13, 13, 13, 13, 13,
> 13, 13, 13, 13, 13, 13, 22, 13, 13, 13, 13, 13, 13, 13,13, 13, 13, 13, 13,
> 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13)
>
> # Use the same set of levels everywhere so both tables have the same shape
> ref1 <- factor(ref, levels = c(5, 13, 14, 15, 21, 22))
> pred1 <- factor(class1, levels = c(5, 13, 14, 15, 21, 22))
> pred2 <- factor(class2, levels = c(5, 13, 14, 15, 21, 22))
>
> # Predictions in rows, reference in columns
> t1 <- table(pred1, ref1)
> t2 <- table(pred2, ref1)
>
> cm1 <- confusionMatrix(t1)
> cm1$overall
>
> cm2 <- confusionMatrix(t2)
> cm2$overall
>
> As you can see, the increase in accuracy is very small, but the increase in
> kappa is a little more substantial. Is this increase statistically significant?
>
> Thanks for any help,
>
> Monica
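
A note on the power.prop.test question at the top of this message, as a minimal sketch rather than a definitive answer. power.prop.test() reports the power of a study design, whereas prop.test() reports a p-value, but prop.test() treats the two groups as independent samples. Both classifications here were scored at the same 146 locations, so one common alternative for such paired data is McNemar's test (mcnemar.test() in the stats package). The counts 115 and 117 are assumptions: they are the numbers of correct samples implied by the reported accuracies. compare_paired_accuracy() is a hypothetical helper, not a caret function, and the toy vectors are made up for illustration and are not the data from this thread. Nothing here addresses how to test the difference in kappa.

```r
# Sketch only: power vs. p-value, and a paired comparison of two classifiers.

n <- 146
# Assumed counts of correct samples: 0.7877 * 146 and 0.8014 * 146, rounded.
correct <- c(115, 117)

# power.prop.test() returns the power of the design, not a p-value.
pw <- power.prop.test(n = n, p1 = 0.7877, p2 = 0.8014, strict = TRUE)$power

# prop.test() returns a p-value, but assumes two independent groups.
p_unpaired <- prop.test(correct, c(n, n))$p.value

# Hypothetical helper: McNemar's test on which samples each classifier
# gets right, for classifiers evaluated on the same reference samples.
compare_paired_accuracy <- function(ref, pred_a, pred_b) {
  ok_a <- pred_a == ref
  ok_b <- pred_b == ref
  # Fixed levels keep the table 2 x 2 even if one cell is empty.
  tab <- table(A = factor(ok_a, levels = c(TRUE, FALSE)),
               B = factor(ok_b, levels = c(TRUE, FALSE)))
  list(table    = tab,
       accuracy = c(A = mean(ok_a), B = mean(ok_b)),
       mcnemar  = mcnemar.test(tab))
}

# Toy data, made up for illustration only.
toy_ref    <- c(13, 13, 14, 15, 13, 14, 15, 13, 13, 15, 14, 13)
toy_pred_a <- c(13, 14, 14, 13, 13, 13, 15, 13, 14, 15, 13, 13)
toy_pred_b <- c(13, 13, 14, 15, 13, 13, 15, 13, 14, 15, 14, 13)

res <- compare_paired_accuracy(toy_ref, toy_pred_a, toy_pred_b)
print(pw)
print(p_unpaired)
print(res$accuracy)
print(res$mcnemar$p.value)
```

With the vectors from the original message, the call would be compare_paired_accuracy(ref, class1, class2). The test only uses the samples on which the two classifications disagree in correctness, so with few such samples it has little power to detect a difference.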