Katie N wrote:
Hi,
I am trying to use CART to find an ideal cut-off value for a simple
diagnostic test (i.e., when the test score is above x, diagnose the condition).
When I put in the model
fit <- rpart(outcome ~ TB144, method="class", data=data8)
sometimes it gives me a tree with multiple nodes for the same predictor (see
below for example of tree with 1 or multiple nodes). Is there a way to tell
it to make only 1 node? Or is it safe to assume that the cut-off value on
the primary node is the ideal cut-off?
Thanks!
Katie
http://n4.nabble.com/file/n964970/smartDNA%2BCART%2B-%2BTB144n.jpg
http://n4.nabble.com/file/n964970/smartDNA%2BCART%2B-%2BTB122n.jpg
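
On the narrow question of forcing a single split: one option that may help is to cap the tree depth with rpart.control(maxdepth = 1). The sketch below is only an illustration. The data frame is simulated as a stand-in for data8 (the real data are not shown in this thread), and the coefficients used to generate it are arbitrary assumptions. Capping the depth only limits the size of the tree; it does not by itself show that the first split is a good cut-off.

```r
library(rpart)

# Simulated stand-in for data8 (assumption: a binary outcome and one score, TB144).
set.seed(1)
n <- 200
TB144 <- rnorm(n, mean = 50, sd = 10)
outcome <- factor(rbinom(n, size = 1, prob = plogis(-5 + 0.1 * TB144)))
data8 <- data.frame(outcome = outcome, TB144 = TB144)

# maxdepth = 1 allows at most one split below the root node.
fit <- rpart(outcome ~ TB144, method = "class", data = data8,
             control = rpart.control(maxdepth = 1))

# Count internal (non-leaf) nodes; with maxdepth = 1 there can be at most one.
n_splits <- sum(fit$frame$var != "<leaf>")

# The split point, if a split was made, is in the "index" column of fit$splits.
cut_value <- if (n_splits > 0) fit$splits[1, "index"] else NA
print(fit)
```

If rpart finds no split worth making under its complexity parameter (cp), the tree stays at the root and cut_value is NA.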
Katie,
Do note that the strategy you are using is inconsistent with decision
theory. Optimal decisions have to condition on everything you know
about a single patient, and do not ask the question "to what group does
this patient belong?". For example, we estimate something given the
patient's age is 20 instead of given that her age is less than 60.
That's why logistic regression is used so frequently to estimate
probabilities of disease. Any cutoff that must be used has to be on the
predicted probability scale in order to get an optimum decision, and
that cutoff must be specified by the provider of the utility function.
Even then the cutoff is not fully trusted, e.g., a physician may order
another test at the last minute when the probability of disease is in a
gray zone.
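
As a rough illustration of the point about cutoffs on the probability scale, here is a minimal sketch using glm. Everything specific in it is an assumption made for the example: the simulated data frame standing in for data8, the patient score of 62, and the 4-to-1 cost ratio, which stands in for a utility function that would really have to come from the decision maker.

```r
# Simulated stand-in for data8; the real data are not shown in this thread.
set.seed(1)
n <- 200
TB144 <- rnorm(n, mean = 50, sd = 10)
outcome <- rbinom(n, size = 1, prob = plogis(-5 + 0.1 * TB144))
data8 <- data.frame(outcome = outcome, TB144 = TB144)

# Logistic regression: probability of disease as a smooth function of the score.
fit <- glm(outcome ~ TB144, family = binomial, data = data8)

# Condition on one patient's actual score (hypothetical value 62),
# rather than on membership in a group such as "score above x".
p_hat <- predict(fit, newdata = data.frame(TB144 = 62), type = "response")

# Hypothetical utilities: a missed case is assumed to cost 4 times a false alarm.
cost_fp <- 1
cost_fn <- 4

# Acting is worthwhile when p * cost_fn >= (1 - p) * cost_fp,
# which rearranges to p >= cost_fp / (cost_fp + cost_fn).
p_cut <- cost_fp / (cost_fp + cost_fn)

decision <- if (p_hat >= p_cut) "treat as diseased" else "do not treat"
```

With different costs the threshold p_cut moves, while the fitted model and the patient's estimated probability stay the same.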
Frank
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University