Re: [R] Optimal Y>=q cutoff after logistic regression

David Winsemius Sun, 13 Feb 2011 21:47:03 -0800


On Feb 14, 2011, at 12:31 AM, Daniel Weitzenfeld wrote:

Hi,

I understand that dichotomization of the predicted probabilities after
logistic regression is philosophically questionable, throwing out
information, etc.

But I want to do it anyway.  I'd like to include as a measure of fit %
of observations correctly classified because it's measured in units
that non-statisticians can understand more easily than area under the
ROC curve, Dxy, etc.

Am I right that there is an optimal Y>=q probability cutoff, at which
the True Positive Rate is high and the False Positive Rate is low?


Only if the data supports it.

Visually, it would be the elbow in the ROC curve, right?

If there is an "elbow", perhaps. The real answer is that you should
thoughtfully consider the consequences of a wrong answer that the test
is negative (False -) and those of a wrong answer that a test is
positive (False +) and then make a decision that properly balances
both the costs and the probabilities.
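
As a rough illustration of that balancing, here is a minimal base R sketch that picks the cutoff with the lowest average misclassification cost. The simulated data, the model, and the costs `cost_fn` and `cost_fp` are all assumptions made up for the example; real costs have to come from the problem at hand.

```r
# Illustrative only: simulated data and made-up costs, not from the thread.
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.5 * x))
fit <- glm(y ~ x, family = binomial)
p <- fitted(fit)                      # predicted probabilities

cost_fn <- 5  # assumed cost of a false negative
cost_fp <- 1  # assumed cost of a false positive

cutoffs <- seq(0.01, 0.99, by = 0.01)
expected_cost <- sapply(cutoffs, function(q) {
  pred <- as.integer(p >= q)          # classify as positive when p >= q
  fn <- sum(pred == 0 & y == 1)       # false negatives at this cutoff
  fp <- sum(pred == 1 & y == 0)       # false positives at this cutoff
  (cost_fn * fn + cost_fp * fp) / n   # average cost per observation
})

best_q <- cutoffs[which.min(expected_cost)]
best_q
```

If the predicted probabilities are well calibrated, the cost-minimizing cutoff should land near cost_fp / (cost_fp + cost_fn), so unequal costs move the cutoff away from 0.5 regardless of what the ROC curve looks like.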

My reasoning is that even if you had a near-perfect model, you could
set a stupidly low (high) cutoff and have a higher false positive
(negative) rate than would be optimal.

I know the standard default or starting point is Y>=.5,


Huh... what is Y?

but if my
above reasoning is correct, there ought to be an optimal cutoff for a
given model.  Is there an easy way to determine that cutoff in R
without writing my own script to iterate through possible breakpoints
and calculating classification accuracy at each one?


There are packages that handle ROC analyses.
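
For what it is worth, the iteration described in the question is only a few lines of base R. The sketch below uses simulated data and hypothetical names (`cutoffs`, `res`), and it reports in-sample accuracy, which will tend to look better than accuracy on new data. Packages such as ROCR and pROC cover the same ground with more options.

```r
# Illustrative only: simulated data; names are hypothetical.
set.seed(2)
n <- 400
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.3 + 1.2 * x))
fit <- glm(y ~ x, family = binomial)
p <- fitted(fit)                      # predicted probabilities

cutoffs <- seq(0.05, 0.95, by = 0.05)
res <- t(sapply(cutoffs, function(q) {
  pred <- p >= q                      # TRUE = classified as positive
  c(cutoff   = q,
    accuracy = mean(pred == (y == 1)),           # share correctly classified
    tpr      = sum(pred & y == 1) / sum(y == 1), # true positive rate
    fpr      = sum(pred & y == 0) / sum(y == 0)) # false positive rate
}))
res <- as.data.frame(res)

# Row with the highest in-sample classification accuracy
res[which.max(res$accuracy), ]
```

Plotting `res$tpr` against `res$fpr` gives a coarse ROC curve, which makes it easy to see whether there is any "elbow" at all.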


Thanks in advance.
-Dan

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT
