Dieter Menne wrote:
Maithili Shiva <maithili_shiva <at> yahoo.com> writes:
I have a main sample of 42500 clients and, based on their status
(defaulted / non-defaulted), I have generated the probability of default.
I have a hold-out sample of 5000 clients. I have calculated (1) the number of
correctly classified goods (Gg), (2) the number of correctly classified bads
(Bb), (3) the number of wrongly classified bads (Gb) and (4) the number of
wrongly classified goods (Bg).
The simple and wrong answer is to use these data directly to compute sensitivity
(fraction of hits). This measure is useless, but I encounter it often in medical
publications.
Exactly. Using classification accuracy, sensitivity, specificity means
that you are not using the model's predicted probabilities in a
reasonable or powerful way. Credit scoring models need to demonstrate
absolute calibration accuracy.
Frank
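To illustrate the point about predicted probabilities, here is a minimal
Python sketch (not from the original posters). It assumes default is coded
as 1 and uses two made-up sets of predicted probabilities, p_sharp and
p_vague, which produce identical classifications at a 0.5 cutoff. The
function names, the cutoff and the data are all hypothetical. Sensitivity and
specificity cannot tell the two apart, while the Brier score (mean squared
error of the probabilities) and a crude binned calibration table can.

```python
def confusion_rates(y, p, threshold=0.5):
    """Sensitivity and specificity, treating default (y == 1) as positive."""
    tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi >= threshold)
    fn = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi < threshold)
    tn = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi < threshold)
    fp = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def brier_score(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def calibration_table(y, p, n_bins=5):
    """Per non-empty equal-width bin: (mean predicted, observed rate, n)."""
    bins = [[] for _ in range(n_bins)]
    for yi, pi in zip(y, p):
        k = min(int(pi * n_bins), n_bins - 1)
        bins[k].append((pi, yi))
    table = []
    for members in bins:
        if members:
            n = len(members)
            mean_pred = sum(pi for pi, _ in members) / n
            obs_rate = sum(yi for _, yi in members) / n
            table.append((mean_pred, obs_rate, n))
    return table


# Hypothetical hold-out outcomes (1 = defaulted) and two sets of predictions.
y = [1, 1, 1, 0, 0, 0, 0, 0]
p_sharp = [0.90, 0.80, 0.40, 0.10, 0.20, 0.10, 0.60, 0.10]
p_vague = [0.55, 0.51, 0.49, 0.45, 0.48, 0.40, 0.52, 0.47]

for name, p in (("sharp", p_sharp), ("vague", p_vague)):
    sens, spec = confusion_rates(y, p)
    print(name, "sensitivity:", sens, "specificity:", spec,
          "Brier:", brier_score(y, p))
    print(name, "calibration:", calibration_table(y, p))
```

With only eight cases the calibration table is far too coarse to be
meaningful; on a hold-out sample of 5000 clients the same grouping would give
a usable first look at whether predicted and observed default rates agree.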
You can get a more reasonable answer by using cross-validation. Check, for
example, Frank Harrell's
http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RmS/logistic.val.pdf
Dieter
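As a rough illustration of the cross-validation idea, here is a generic
k-fold sketch in plain Python. It is not the procedure from the linked
document and not anyone's actual credit model: the single predictor, the
synthetic data, the gradient-ascent fit and all names are assumptions made
for the example. Each client is scored by a model that was fitted without
that client, and the out-of-fold probabilities are summarised with a Brier
score.

```python
import math
import random


def fit_logistic(xs, ys, steps=500, lr=0.5):
    """Fit p = 1 / (1 + exp(-(a + b*x))) by gradient ascent on the
    average log-likelihood (a toy fitter, for illustration only)."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += y - p
            grad_b += (y - p) * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def predict(model, x):
    """Predicted probability of default for one predictor value."""
    a, b = model
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


def cross_validated_brier(xs, ys, k=5, seed=1):
    """k-fold cross-validated Brier score: every observation is predicted
    by a model fitted on the other k - 1 folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering idx
    sq_err = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_logistic([xs[i] for i in train], [ys[i] for i in train])
        sq_err += sum((predict(model, xs[i]) - ys[i]) ** 2 for i in held_out)
    return sq_err / len(xs)


# Synthetic data: one score x, default indicator y drawn from a known model.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(200)]
ys = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(-1.0 + 1.5 * x))) else 0
      for x in xs]

cv_brier = cross_validated_brier(xs, ys)
base_rate = sum(ys) / len(ys)
print("cross-validated Brier score:", cv_brier)
print("Brier score of always predicting the base rate:",
      base_rate * (1 - base_rate))
```

Comparing the cross-validated score with the base-rate score gives a simple
check that the model's probabilities carry information beyond the overall
default rate.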
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University