Max,
 
Thanks for the reply. Yes, the models are built outside R (I will see what I can 
do to run some of them inside R in the future) and the sampling is 
extremely skewed. But as truth, or reference, we use results from a field 
exercise where people actually went out and gave detailed descriptions of the 
locations visited. This depended very much on the accessibility of the sites, 
and the majority are not accessible. Unfortunately, when results get reported to 
managers, they do care about accuracy, for example, but less about CIs, 
and even less about the skewed sampling, unless I can prove that it gives 
unacceptable results. 
 
Do you know of any good reference that discusses kappa for classification, 
and maybe CIs for kappa?
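
In the meantime, the bootstrap you suggested could be sketched roughly as below in base R, resampling the test cases with replacement. The function names (cohen_kappa, boot_kappa_ci) and the class labels are made up for illustration, and the percentile interval is only one of several possible bootstrap intervals, so please treat this as a sketch rather than a recommended procedure.

```r
# Cohen's kappa from two vectors of class labels (reference vs. predicted).
cohen_kappa <- function(truth, pred) {
  lev <- union(unique(truth), unique(pred))
  # Square confusion matrix with the same levels on both margins
  tab <- table(factor(truth, levels = lev), factor(pred, levels = lev))
  n <- sum(tab)
  p_obs <- sum(diag(tab)) / n                      # observed agreement
  p_exp <- sum(rowSums(tab) * colSums(tab)) / n^2  # agreement expected by chance
  (p_obs - p_exp) / (1 - p_exp)
}

# Percentile bootstrap interval: resample test cases with replacement,
# recompute kappa each time, and take quantiles of the resampled values.
boot_kappa_ci <- function(truth, pred, B = 2000, level = 0.95) {
  n <- length(truth)
  stats <- replicate(B, {
    idx <- sample.int(n, n, replace = TRUE)
    cohen_kappa(truth[idx], pred[idx])
  })
  alpha <- 1 - level
  # na.rm drops resamples where kappa is undefined (only one class drawn)
  quantile(stats, c(alpha / 2, 1 - alpha / 2), na.rm = TRUE)
}

# Hypothetical, deliberately skewed example data (not the field results)
set.seed(1)
classes <- c("forest", "water", "urban")
truth <- sample(classes, 200, replace = TRUE, prob = c(0.7, 0.2, 0.1))
pred <- ifelse(runif(200) < 0.8, truth, sample(classes, 200, replace = TRUE))

kappa_hat <- cohen_kappa(truth, pred)
ci <- boot_kappa_ci(truth, pred)
print(kappa_hat)
print(ci)
```

This only captures the uncertainty from the one test set we have, not the variation between different training/test splits.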
 
Thanks again for your input,
 
Monica 

> Date: Wed, 25 Feb 2009 09:01:23 -0500
> Subject: Re: [R] statistical significance of accuracy increase in 
> classification
> From: mxk...@gmail.com
> To: pisican...@hotmail.com
> CC: r-help@r-project.org
> 
> Monica,
> 
> I have a few thoughts.
> 
> - (I believe) it is usually better to put confidence intervals on these
> metrics instead of relying on p-values. The intervals will allow you to make
> inferential statements and give you a way of characterizing the
> uncertainty in the estimates. You've seen how to do this with
> accuracy. For Kappa, there is probably an analytical formula for a CI,
> but I don't know that it is in R. I would use the bootstrap (via the
> boot or bootstrap package) to get intervals for kappa.
> 
> - It sounds like some of the models were generated outside of R. I
> think that the sampling uncertainty can be large. In other words, if
> you were to do another training/test split, you would get different
> results so the CIs for accuracy or kappa on a single test set don't
> really reflect this sampling noise. If you were doing models in R, I
> would suggest that you do many training/test splits and look at the
> distributions of those metrics.
> 
> 
> Max
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
