If two variables give exactly the same split improvement, then rpart will use 
the one that appears first in the model statement.  So with the call
        rpart(group ~ age + height + weight + sex)
if at some node both age and weight gave a split with 20 correct and 9 
incorrect, then age would be used to split at that node.
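
A minimal sketch of this tie-breaking rule, assuming the rpart package is installed. The toy data frame d and the variables x1 and x2 are made up for illustration; x2 is an exact copy of x1, so the two always tie.

```r
library(rpart)

# Hypothetical toy data: x2 is an exact copy of x1, so every split on x1
# is matched by an equally good split on x2.
d <- data.frame(x1 = rep(c(0, 1), each = 20))
d$x2 <- d$x1
d$group <- factor(ifelse(d$x1 == 1, "a", "b"))

# Same model, with the two tied variables listed in opposite order
fit1 <- rpart(group ~ x1 + x2, data = d)
fit2 <- rpart(group ~ x2 + x1, data = d)

# Variable chosen for the split at the root node of each tree
root1 <- as.character(fit1$frame$var[1])
root2 <- as.character(fit2$frame$var[1])
print(c(root1, root2))
```

By the rule above, one would expect each fit to split on whichever variable is listed first in its formula.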

  Even though the errors of the age and weight splits are the same, the sets of 9 
subjects that each split misclassifies may differ, i.e., the two splits don't 
send exactly the same observations to the left and the right.  Thus, the rest 
of the tree from that point on may be different, giving a different fit.
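
  As a small illustration in base R, with made-up left/right assignments (truth, age_left and weight_left are hypothetical, not output from rpart), two splits can have the same error count while misclassifying different subjects:

```r
# Hypothetical true classes for 10 subjects
truth <- c("a", "a", "a", "a", "a", "b", "b", "b", "b", "b")

# Hypothetical assignments: TRUE = sent left (predicted "a"),
# FALSE = sent right (predicted "b")
age_left    <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE)
weight_left <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)

# Indices of the subjects a given split gets wrong
misclassified <- function(left) which(ifelse(left, "a", "b") != truth)

wrong_age    <- misclassified(age_left)
wrong_weight <- misclassified(weight_left)

# Same number of errors, but not the same subjects
print(length(wrong_age) == length(wrong_weight))
print(identical(wrong_age, wrong_weight))
```

Since the two child nodes then contain different observations, later splits are chosen from different data.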
  
  For continuous y it is rare for two splits to have exactly the same R^2, 
but such ties are not uncommon in classification problems.  
  
        Terry Therneau
