I think you want something like this:

optimal.nSplit = rep(NA, 50)  # This will hold the result
for (run in 1:50)
{
  fit1 = rpart(...)
  cpTable = fit1$cptable
  bestRow = which.min(cpTable[, "xerror"])
  optimal.nSplit[run] = cpTable[bestRow, "nsplit"]
}
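
For anyone who wants to try the loop end to end, here is a minimal sketch, not a definitive implementation. It makes several assumptions that are not in the original post: the helper name best.nsplit is hypothetical, the built-in mtcars data stands in for the poster's chabun data (which is not available here), and set.seed(1) is used only so the sketch is repeatable. The rpart call copies the control settings from the question. The helper is first checked against the cptable posted in the question, where the lowest xerror is in the row with nsplit = 0.

```r
library(rpart)  # recommended package, normally installed with R

# Hypothetical helper: nsplit of the cptable row with the lowest xerror
best.nsplit <- function(cpTable) {
  bestRow <- which.min(cpTable[, "xerror"])
  unname(cpTable[bestRow, "nsplit"])
}

# The cptable from the question, typed in by hand as a check
posted <- matrix(
  c(0.539806, 0, 1.00000, 1.0337, 0.41238,
    0.050516, 1, 0.46019, 1.2149, 0.38787,
    0.016788, 2, 0.40968, 1.2719, 0.41280,
    0.010221, 3, 0.39289, 1.1852, 0.38300,
    0.010000, 4, 0.38267, 1.1740, 0.38333),
  ncol = 5, byrow = TRUE,
  dimnames = list(NULL, c("CP", "nsplit", "rel error", "xerror", "xstd"))
)
best.nsplit(posted)  # 0, as stated in the question

# Assumption: mtcars replaces chabun; the seed is only for repeatability
set.seed(1)
nRuns <- 50
optimal.nSplit <- rep(NA, nRuns)  # one result per run
for (run in seq_len(nRuns)) {
  fit <- rpart(mpg ~ ., data = mtcars, method = "anova",
               control = rpart.control(minsplit = 10, cp = 0.01, xval = 10))
  optimal.nSplit[run] <- best.nsplit(fit$cptable)
}

table(optimal.nSplit)   # frequency distribution of the chosen tree sizes
median(optimal.nSplit)  # the median tree size over the runs
```

The results differ from run to run because the cross-validation folds are drawn at random on each call, which is why xerror (and therefore the chosen nsplit) can change between fits.
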
In any case, look at

?rpart
?printcp
?rpart.object

Peter

On Tue, Oct 12, 2010 at 4:50 PM, Andrew Halford <andrew.half...@gmail.com> wrote:
> Hi All,
>
> I have to say upfront that I am a complete neophyte when it comes to
> programming. Nevertheless I enjoy the challenge of using R because of its
> incredible statistical resources.
>
> My problem is this .........I am running a regression tree analysis using
> "rpart" and I need to run the calculation repeatedly (say n=50 times) to
> obtain a distribution of results from which I will pick the median one to
> represent the most parsimonious tree size. Unfortunately rpart does not
> contain this ability so it will have to be coded for.
>
> Could anyone help me with this? I have provided the code (and relevant
> output) for the analysis I am running. I need to run it n=50 times and from
> each output pick the appropriate tree size and post it to a datafile where I
> can then look at the frequency distribution of tree sizes.
>
> Here is the code and output from a single run
>
> > fit1 <- rpart(CHAB~.,data=chabun, method="anova",
> control=rpart.control(minsplit=10, cp=0.01, xval=10))
> > printcp(fit1)
>
> Regression tree:
> rpart(formula = CHAB ~ ., data = chabun, method = "anova", control =
> rpart.control(minsplit = 10,
>    cp = 0.01, xval = 10))
> Variables actually used in tree construction:
> [1] EXP LAT POC RUG
> Root node error: 35904/33 = 1088
> n= 33
>         CP nsplit rel error xerror    xstd
> 1 0.539806      0   1.00000 1.0337 0.41238
> 2 0.050516      1   0.46019 1.2149 0.38787
> 3 0.016788      2   0.40968 1.2719 0.41280
> 4 0.010221      3   0.39289 1.1852 0.38300
> 5 0.010000      4   0.38267 1.1740 0.38333
>
> Each time I re-run the model I will get a slightly different output. I want
> to extract the nsplit number corresponding to the lowest xerror for each run
> of the model (in this case it is for nsplit = 0) over 50 runs and then look
> at the distribution of nsplits after 50 runs.
>
> Any help appreciated.
>
>
> Andy
>
>
> --
> Andrew Halford
> Associate Researcher
> Marine Laboratory
> University of Guam
> Ph: +1 671 734 2948
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>