On Sat, 19 Mar 2011, Penny B wrote:

> I am trying to find out what type of sampling scheme is used to select the 10
> subsets in 10-fold cross-validation process used in rpart to choose the best
> tree. Is it simple random sampling? Is there any documentation available on
> this?

Not SRS (at least in its conventional meaning), as it is partitioning: the 10 folds are disjoint.

Note that this happens in two places, in rpart() and in xpred.rpart(), but the (default) method is the same. I presume you asked about the first, but it wasn't clear.

There is a lot of documentation on the meaning of '10-fold cross-validation', e.g. in my 1996 book. There are a few slightly different ways to do it, and you can read the rpart sources if you want to know the details.
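
To illustrate the partitioning point, here is a minimal sketch in Python of one common way to form disjoint folds: build a vector of fold labels of the right length and randomly permute it. This is an assumption for illustration only, not a transcription of what rpart does (the rpart sources remain the reference), and the function name assign_folds and its arguments are hypothetical.

```python
import random


def assign_folds(n_obs, n_folds=10, seed=None):
    """Return one fold label (0 .. n_folds-1) per observation.

    Hypothetical helper: illustrates a random partition, not rpart's code.
    """
    rng = random.Random(seed)
    # Cycle through the labels so that fold sizes differ by at most one.
    labels = [i % n_folds for i in range(n_obs)]
    # Permute the labels: membership is random, but every observation
    # still receives exactly one label, so the folds are disjoint.
    rng.shuffle(labels)
    return labels


folds = assign_folds(n_obs=103, n_folds=10, seed=1)
sizes = [folds.count(k) for k in range(10)]
print(sizes)  # [11, 11, 11, 10, 10, 10, 10, 10, 10, 10]
```

Each observation is held out exactly once across the 10 folds. Drawing 10 independent simple random samples would not guarantee that, since an observation could then land in several subsets or in none.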

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
