Hi all, I'm currently using the 'rpart' function to fit regression trees, and I am at the point where I wish to prune my overfitted trees. Having read the documentation, I understand that pruning is controlled by the complexity parameter (cp). My question is how to choose the right cp for my tree. In some places (http://www.statmethods.net/advstats/cart.html) I have read that it is best to select the cp that minimises the cross-validated error (the 'xerror' column of the cp table), but elsewhere I have read that the optimum cp is the leftmost value whose cross-validated error lies below the '1+SE' line of the complexity parameter plot (plotcp).
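To make the two rules concrete, here is a minimal sketch of how I understand each of them. It is only an illustration under assumptions: the built-in 'mtcars' data stands in for my real data, and the 'rpart.control' settings and the variable names (cp_min, cp_1se and so on) are placeholders I chose to grow a deliberately large tree, not recommendations.

```r
library(rpart)

set.seed(1)  # cross-validation folds are random, so fix the seed

# Grow a deliberately large regression tree.
# ASSUMPTION: mtcars and these control values are placeholders only.
fit <- rpart(mpg ~ ., data = mtcars, method = "anova",
             control = rpart.control(cp = 0.001, minsplit = 5, xval = 10))

# cp table columns: CP, nsplit, rel error, xerror, xstd
# (rows run from the smallest tree to the largest)
cpt <- fit$cptable

# Rule 1: the cp with the smallest cross-validated error
i_min  <- which.min(cpt[, "xerror"])
cp_min <- cpt[i_min, "CP"]

# Rule 2 (1-SE): the smallest tree whose xerror is within one
# standard error of the minimum xerror
threshold <- cpt[i_min, "xerror"] + cpt[i_min, "xstd"]
i_1se  <- min(which(cpt[, "xerror"] <= threshold))
cp_1se <- cpt[i_1se, "CP"]

# Prune the full tree under each rule
pruned_min <- prune(fit, cp = cp_min)
pruned_1se <- prune(fit, cp = cp_1se)

printcp(fit)  # prints the cp table
plotcp(fit)   # plots xerror against cp with the 1-SE reference line
```

As far as I can tell, the second rule can never select a larger tree than the first, so the two differ only in how aggressively they prune.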
I was hoping someone might be able to clarify this minor issue for me.

Many thanks,
Andy

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.