Katie N wrote:
Hi,
I am trying to use CART to find an ideal cut-off value for a simple
diagnostic test (i.e., when the test score is above x, diagnose the condition). When I fit the model
fit <- rpart(outcome ~ TB144, method="class", data=data8)  # TB144 is predictor1

sometimes it gives me a tree with multiple nodes for the same predictor (see
below for example of tree with 1 or multiple nodes).  Is there a way to tell
it to make only 1 node?  Or is it safe to assume that the cut-off value on
the primary node is the ideal cut-off?

Thanks!
Katie

http://n4.nabble.com/file/n964970/smartDNA%2BCART%2B-%2BTB144n.jpg http://n4.nabble.com/file/n964970/smartDNA%2BCART%2B-%2BTB122n.jpg
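
On the mechanical question of getting a single split: a minimal sketch, assuming the predictor column is named TB144 and using simulated data in place of data8 (the real data are not shown in the post). rpart.control(maxdepth = 1) limits the tree to one split below the root, and the split point can then be read from the fitted object. The simulation parameters are arbitrary.

```r
library(rpart)

# Simulated stand-in for data8 (hypothetical; the original data are not shown)
set.seed(42)
n <- 300
TB144 <- rnorm(n, mean = 50, sd = 10)
outcome <- factor(rbinom(n, size = 1, prob = plogis(-10 + 0.2 * TB144)))
data8 <- data.frame(outcome = outcome, TB144 = TB144)

# maxdepth = 1 allows at most one split below the root node
fit1 <- rpart(outcome ~ TB144, method = "class", data = data8,
              control = rpart.control(maxdepth = 1))

n_leaves <- sum(fit1$frame$var == "<leaf>")  # terminal nodes in the fitted tree
cutoff <- fit1$splits[1, "index"]            # split point of the root split
print(fit1)
```

Without the maxdepth restriction, rpart is free to split again on the same predictor whenever that improves the fit, which is why the unrestricted tree can show several nodes for TB144.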


Katie,

Do note that the strategy you are using is inconsistent with decision theory. Optimal decisions have to condition on everything you know about a single patient; they do not ask the question "to what group does this patient belong?". For example, we estimate something given that the patient's age is 20 rather than given that her age is less than 60. That's why logistic regression is used so frequently to estimate probabilities of disease. Any cutoff that must be used has to be on the predicted probability scale in order to get an optimum decision, and that cutoff must be specified by the provider of the utility function. Even then the cutoff is not fully trusted, e.g., a physician may order another test at the last minute when the probability of disease is in a gray zone.
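
As an illustration of the approach described above, here is a minimal sketch using glm() on simulated data. The data, the column name TB144, the patient score of 62, and the two cost values are all assumptions made for the example. The cutoff formula assumes correct decisions cost nothing and that only the relative costs of false positives and false negatives matter; in practice those utilities must come from the decision maker.

```r
# Simulated stand-in for data8 (hypothetical; the original data are not shown)
set.seed(1)
n <- 200
TB144 <- rnorm(n, mean = 50, sd = 10)
outcome <- rbinom(n, size = 1, prob = plogis(-5 + 0.1 * TB144))
data8 <- data.frame(outcome = outcome, TB144 = TB144)

# Logistic regression keeps the test score continuous
fit <- glm(outcome ~ TB144, family = binomial, data = data8)

# Predicted probability of disease given this patient's actual score
new_patient <- data.frame(TB144 = 62)
p_hat <- predict(fit, newdata = new_patient, type = "response")

# Hypothetical costs supplied by the decision maker
cost_fp <- 1  # cost of acting on a patient without the disease
cost_fn <- 4  # cost of not acting on a patient with the disease

# Acting has lower expected cost when p * cost_fn > (1 - p) * cost_fp
p_cut <- cost_fp / (cost_fp + cost_fn)

decision <- if (p_hat > p_cut) "act" else "do not act"
```

The cutoff here sits on the probability scale and changes with the costs, not with the distribution of test scores in the sample.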

Frank
--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
