Jen_mp3 wrote:
So I have 2 sets of data - a training data set and a test data set. I've been
doing the analysis on the training data set and then using predict and
feeding the test data through that. There are 114 rows in the training data
and 117 in the test data and 1024 columns in both. It's actually the same
set of data split into two. The rows are made of 5 different numbers. They
do represent something but it would take too long to explain.

Your sample size is too small by a factor of perhaps 100 for simple data splitting to provide stable results. Then you have the problem of an improper scoring rule, i.e., one that when optimized gives the wrong answer.
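
To make the scoring-rule point concrete, here is a small base-R sketch. The true probability q and the three forecasts p are made-up numbers for illustration only. The expected misclassification rate at a 0.5 cutoff cannot tell the forecasts apart, while the expected Brier score (a proper rule) is smallest for the forecast that equals the true probability.

```r
# Hypothetical numbers, chosen only to illustrate the idea.
q <- 0.7                     # assumed true probability of the event
p <- c(0.51, 0.70, 0.99)     # three candidate forecast probabilities

# Expected misclassification rate when predicting "event" if p > 0.5:
# it depends only on which side of 0.5 the forecast falls.
exp_misclass <- ifelse(p > 0.5, 1 - q, q)

# Expected Brier score: mean squared difference between forecast and outcome.
exp_brier <- q * (1 - p)^2 + (1 - q) * p^2

print(data.frame(forecast = p, exp_misclass, exp_brier))
```

All three forecasts get the same misclassification rate, so optimizing it gives no reason to prefer the correct 0.70 over 0.51 or 0.99.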

Frank Harrell


I want to try and find a classification rule for the 5 numbers in the rows
based on the columns so I created a classification tree and plotted that and
then pruned it. My first question is: how do you print the misclassification
rate at each node on the actual diagram of the classification tree? I can't
seem to get it up there. My notes use gmistext(), but I have a feeling that
it's for S-PLUS rather than R, as gmistext() doesn't work for me either.
My second question: when I use predict.tree to put the test data
into the tree and then plot the result, I get a plot that looks really
weird and wrong. Here is the code I'm using:
tree1 <- tree(row~.,data=train)
pruned.tree <- prune.tree(tree1, best = 5, method = "misclass")
predict.tree1 <- predict.tree(prune.tree, data = main)
plot(predict.tree);text(predict.tree)
I sort of don't get a classification tree, I get the x axis labelled 1, the
y axis labelled 2 and then about 4 small black rectangles scattered across
the plot. Thanks in Advance.
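
A minimal sketch of one way the code above could be repaired, assuming the tree package and a factor response. The train and test data frames here are simulated stand-ins (make_data is a hypothetical helper), since the real data are not shown. Two things stand out in the posted code: it passes prune.tree (the function) instead of pruned.tree, and it uses data = where predict.tree expects newdata =. Also, by default predict on a classification tree returns a matrix of class probabilities, and plotting a matrix draws its first column against its second, which would plausibly explain axes labelled 1 and 2. Asking for type = "class" returns predicted labels instead. The per-node misclassification rates are computed with misclass.tree(detail = TRUE) and printed as a table rather than drawn on the plot.

```r
library(tree)

# Simulated stand-in data (hypothetical): 5 classes, 10 predictors.
set.seed(1)
make_data <- function(n) {
  x <- matrix(rnorm(n * 10), n, 10, dimnames = list(NULL, paste0("V", 1:10)))
  cls <- cut(x[, 1] + x[, 2], breaks = c(-Inf, -1.5, -0.5, 0.5, 1.5, Inf),
             labels = 1:5)
  data.frame(row = cls, x)   # the response must be a factor for classification
}
train <- make_data(114)
test  <- make_data(117)

# Fit on the training data, then prune using misclassification as the criterion.
tree1 <- tree(row ~ ., data = train)
pruned.tree <- prune.tree(tree1, best = 5, method = "misclass")

# The plot shows the fitted tree itself, not the predictions.
plot(pruned.tree)
text(pruned.tree)

# Misclassification rate at every node (training data), one entry per node.
node_err <- misclass.tree(pruned.tree, detail = TRUE) / pruned.tree$frame$n
print(round(node_err, 2))

# Predicted classes for the test data: note newdata = and type = "class".
pred_class <- predict(pruned.tree, newdata = test, type = "class")
confusion <- table(predicted = pred_class, actual = test$row)
test_err <- mean(as.character(pred_class) != as.character(test$row))
print(confusion)
print(test_err)
```

The confusion table and test error summarise the predictions; there is no separate tree to plot for the test set.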


--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
