Amber Dawn Nolder wrote 2014-05-28 23:16:
Hello,
I am an R novice, and I am using the "partykit" package to create regression trees. I used the following to generate the trees:

ctree(y ~ x1 + x2 + x3 + x4, data = my_data, control = ctree_control(testtype = "Bonferroni", mincriterion = 0.90, minsplit = 12, minbucket = 4, majority = TRUE))

I thought that "minbucket" set the minimum value for the sum of weights in each terminal node, and that each case weight is 1 unless otherwise specified. In that case, the sum of case weights in a node should equal the number of cases (n) in that node. However, I sometimes obtain a tree with a terminal node that contains fewer than 4 cases.
I do agree that the tree below looks suspicious. You may have found a bug.
But you didn't provide "commented, minimal, self-contained, reproducible code", i.e., we're missing your 'my_data' object, and therefore we cannot reproduce this easily. Can you please provide us with the output from 'dput(my_data)'?
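A minimal sketch of what such a reproducible example might look like, using base R only. The data frame 'example_data' is a hypothetical, simulated stand-in for 'my_data' (36 cases, continuous variables, NAs in x1 and x2, as described in the post). It is not the original data, and it is not expected to reproduce the suspicious tree.

```r
# Hypothetical stand-in for 'my_data': simulated values, not the poster's data.
set.seed(1)
n <- 36
example_data <- data.frame(
  y  = runif(n),
  x1 = rnorm(n),
  x2 = runif(n, 0, 100),
  x3 = runif(n),
  x4 = runif(n, 0, 60)
)
# Induce missing values in x1 and x2, as in the described data set.
example_data$x1[c(3, 17)] <- NA
example_data$x2[c(8, 21, 30)] <- NA

# dput() writes an R expression that rebuilds the object (up to printing
# precision), so it can be pasted into a plain-text message.
dput_text <- paste(capture.output(dput(example_data)), collapse = "\n")

# Check that the pasted text really recreates the data frame.
rebuilt <- eval(parse(text = dput_text))
print(all.equal(rebuilt, example_data))

# Version details of R and of the attached packages, as requested.
print(sessionInfo())
```

Posting the text held in 'dput_text' together with the 'sessionInfo()' output and the exact 'ctree()' call should let others rerun the fit on their own machines.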
My data set has a total of 36 cases. The dependent and all independent variables are continuous data. Variables x1 and x2 contain missing (NA) values.
I tried a few other data sets and there the results seem to come out OK (even after inducing NAs).
Could someone please explain why I am getting these results?
Probably. But you need to provide a reproducible example and the details obtained by 'sessionInfo()'.
As per the posting guide, since this is a contributed package you should first contact its maintainer (Torsten Hothorn, CC'd) and only post here if you get no reply. Did you try contacting Torsten?
Am I mistaken about the value of case weights or about the use of minbucket to restrict the size of a terminal node?
I don't think you're mistaken since '?ctree_control' says that "minbucket: the minimum sum of weights in a terminal node."
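The check implied here can be sketched in base R. The node ids and node sizes below are copied from the tree output quoted in this thread. The unit case weights are an assumption based on the statement that no weights were specified.

```r
# Terminal node ids and sizes copied from the printed tree in this thread.
node_id <- rep(c(2L, 4L, 6L, 7L), times = c(17L, 8L, 3L, 8L))

# Assumption: no weights were supplied, so every case has weight 1.
case_weight <- rep(1, length(node_id))
minbucket <- 4

# Sum of weights per terminal node; with unit weights this equals n.
weight_sum <- tapply(case_weight, node_id, sum)

# Terminal nodes whose sum of weights falls below minbucket.
too_small <- names(weight_sum)[weight_sum < minbucket]

print(weight_sum)
print(too_small)
```

With a fitted tree at hand, the node ids could presumably be taken from 'predict(fit, type = "node")' in partykit instead of being typed in by hand, where 'fit' is a hypothetical name for the object returned by 'ctree()'.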
Henric
This is an example of the output:

Model formula:
y ~ x1 + x2 + x3 + x4

Fitted party:
[1] root
|   [2] x4 <= 30: 0.927 (n = 17, err = 1.1)
|   [3] x4 > 30
|   |   [4] x2 <= 43: 0.472 (n = 8, err = 0.4)
|   |   [5] x2 > 43
|   |   |   [6] x3 <= 0.4: 0.282 (n = 3, err = 0.0)
|   |   |   [7] x3 > 0.4: 0.020 (n = 8, err = 0.0)

Number of inner nodes:    3
Number of terminal nodes: 4

Many thanks!
Amber Nolder
Graduate Student
Indiana University of Pennsylvania

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.