[R] pruning trees using rpart

Tom Cattaert Wed, 17 Dec 2008 01:41:56 -0800

Hi,

I am using the packages tree and rpart to build a classification tree to
predict a 0/1 outcome. The package rpart has the advantage that the function
plotcp gives a visual representation of the cross-validation results with a
horizontal line indicating the 1 standard error rule, i.e. the
recommendation to select the most parsimonious model (the smallest tree)
whose error is not more than one standard error above the error of the best
model.
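
For concreteness, the rule described above can be sketched from the table that plotcp draws. This is only a minimal illustration: the kyphosis data shipped with rpart stands in for my own 0/1 outcome, and the object names (fit, cp_tab, chosen, pruned) are made up for the example.

```r
library(rpart)

# kyphosis ships with rpart; it is a stand-in for the real 0/1 outcome data
set.seed(1)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class",
             control = rpart.control(cp = 0, xval = 10))

# cptable columns: CP, nsplit, rel error, xerror, xstd
cp_tab <- fit$cptable

# row with the smallest cross-validated error
best <- which.min(cp_tab[, "xerror"])

# horizontal line of the 1 standard error rule: minimum xerror plus its xstd
threshold <- cp_tab[best, "xerror"] + cp_tab[best, "xstd"]

# smallest tree (first row) whose cross-validated error is within the threshold
chosen <- min(which(cp_tab[, "xerror"] <= threshold))

# prune the full tree back to the chosen complexity parameter
pruned <- prune(fit, cp = cp_tab[chosen, "CP"])
print(cp_tab[chosen, ])
```

Rows of the table are ordered from the smallest tree to the largest, so taking the first row under the threshold gives the most parsimonious candidate.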


However, rpart does not give me trees of every size: in one example I am
working with, the sizes are 1, 2 and 5, whereas cv.tree in package tree gives
1, 2, 3, 4, 5, as I would expect (weakest-link pruning successively collapses
the internal nodes that contribute the least). What is the reason for this?
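
The sizes I am referring to can be read off the rpart fit as follows. This is a sketch on the kyphosis data shipped with rpart, not on my own data, so the sizes it prints will differ from the 1, 2, 5 mentioned above; the name sizes is made up for the example.

```r
library(rpart)

# xval = 0 skips cross-validation; only the sequence of subtrees is needed here
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class",
             control = rpart.control(cp = 0, xval = 0))

# each row of cptable is one candidate subtree;
# its number of terminal nodes is the number of splits plus one
sizes <- unname(fit$cptable[, "nsplit"]) + 1
print(sizes)
```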

A second problem I am having in both packages is that the cross-validation
results are highly variable between different runs of the programs. This is
not unexpected, as cross-validation means that the dataset is randomly
divided into 10 equal subsets, which can be done in many different ways.
One then hopes that the results do not depend on this very much, but I
have observed that they often do. Should one then do this many times, e.g. 100, each
time select the model using the 1 standard error rule, and in the end count
which model got selected most often? Or rather do it many times and average
the means and standard errors of the prediction error? Or does a very high
variability in cross-validation results mean that the dataset is too small
to reach conclusions?
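
To make the two alternatives concrete, here is a rough sketch of what I have in mind, again on the kyphosis data from rpart rather than my own. The helper one_se_choice and the other object names are hypothetical, and the sketch relies on the tree grown on the full data being the same in every run, so that only the xerror and xstd columns of cptable change with the random split.

```r
library(rpart)

# hypothetical helper: row of cptable selected by the 1 standard error rule
one_se_choice <- function(tab) {
  best <- which.min(tab[, "xerror"])
  min(which(tab[, "xerror"] <= tab[best, "xerror"] + tab[best, "xstd"]))
}

n_rep <- 100
ctrl  <- rpart.control(cp = 0, xval = 10)

# refit with a different random 10-fold split each time and keep the cptable
tabs <- lapply(seq_len(n_rep), function(i) {
  set.seed(i)
  rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
        method = "class", control = ctrl)$cptable
})

# Alternative 1: count how often each candidate tree is selected
picks     <- vapply(tabs, one_se_choice, integer(1))
pick_freq <- table(factor(picks, levels = seq_len(nrow(tabs[[1]]))))

# Alternative 2: average the cross-validated error and its standard error
xerr <- sapply(tabs, function(tab) tab[, "xerror"])  # rows: trees, cols: runs
xstd <- sapply(tabs, function(tab) tab[, "xstd"])
avg  <- cbind(nsplit      = tabs[[1]][, "nsplit"],
              mean_xerror = rowMeans(xerr),
              mean_xstd   = rowMeans(xstd),
              sd_xerror   = apply(xerr, 1, sd))

print(pick_freq)
print(avg)
```

The sd_xerror column gives a direct view of how much the cross-validated error moves between runs for each tree size.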

Kind regards and thanks for your help,
Tom


