Amber Dawn Nolder wrote 2014-05-28 23:16:

    Hello,
    I am an R novice, and I am using the "partykit" package to create
    regression trees. I used the following to generate the trees:
    ctree(y~x1+x2+x3+x4,data=my_data,control=ctree_control(testtype =
    "Bonferroni", mincriterion = 0.90, minsplit = 12, minbucket = 4,
    majority = TRUE))
    I thought that "minbucket" set the minimum value for the sum of weights
    in each terminal node, and that each case weight is 1, unless otherwise
    specified. In which case, the sum of case weights in a node should equal the
    number of cases (n) in that node. However, I sometimes obtain a tree with
    a terminal node that contains fewer than 4 cases.

I do agree that the tree below looks suspicious. You may have found a bug.

But you didn't provide "commented, minimal, self-contained, reproducible code", i.e., we're missing your 'my_data' object, and therefore we cannot reproduce this easily. Can you please provide us with the output from 'dput(my_data)'?
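
As a rough sketch of what such a report could contain: the data frame below is a made-up stand-in, not the poster's data. It only mirrors what the post describes (36 cases, continuous variables, NAs in x1 and x2). The column values, the seed and the NA positions are all assumptions for illustration. Only base R is used.

```r
# Hypothetical stand-in for 'my_data'; replace it with the real object.
set.seed(1)
my_data <- data.frame(
  y  = runif(36),
  x1 = rnorm(36),
  x2 = rnorm(36, mean = 40, sd = 10),
  x3 = runif(36),
  x4 = rnorm(36, mean = 30, sd = 10)
)
my_data$x1[c(3, 17)] <- NA        # assumed NA positions
my_data$x2[c(5, 20, 31)] <- NA    # assumed NA positions

# Text representation that others can paste into R to recreate the object.
dput(my_data)

# R version, platform and attached package versions.
sessionInfo()
```

Pasting the 'dput()' output together with the exact 'ctree()' call and the 'sessionInfo()' output into the message should let others rerun the fit.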

    My data set has a total of 36 cases. The dependent and all independent
    variables are continuous data. Variables x1 and x2 contain missing (NA)
    values.

I tried a few other data sets and there the results seem to come out OK (even after inducing NAs).

    Could someone please explain why I am getting these results?

Probably. But you need to provide a reproducible example and the details obtained by 'sessionInfo()'.

As per the posting guide, since this is a contributed package you should first contact its maintainer (Torsten Hothorn, CC'd) and only post here if you get no reply. Did you try contacting Torsten?

    Am I mistaken about the value of case weights or about the use of minbucket
    to restrict the size of a terminal node?

I don't think you're mistaken since '?ctree_control' says that "minbucket: the minimum sum of weights in a terminal node."
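
To make the expectation concrete, here is a small base R sketch of the arithmetic. The node sizes are copied from the tree output quoted in this thread. The unit weights are an assumption (no 'weights' argument appears in the call shown), and the variable names are only illustrative.

```r
# Terminal node sizes as printed in the quoted tree (node id = n).
node_n <- c("2" = 17, "4" = 8, "6" = 3, "7" = 8)

# Assumption: no weights were supplied, so every case has weight 1.
node_id <- rep(names(node_n), times = node_n)
w <- rep(1, length(node_id))

# Sum of case weights per terminal node; with unit weights this equals n.
weight_sum <- tapply(w, node_id, sum)

# Nodes whose weight sum falls below the requested minbucket.
minbucket <- 4
violating <- names(weight_sum)[weight_sum < minbucket]

print(weight_sum)
print(violating)
```

Under these assumptions the weight sums add up to the 36 cases, and node 6 (n = 3) is the only terminal node below minbucket = 4, which is the discrepancy being reported.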


Henric



    This is an example of the output:
    Model formula:
    y ~ x1 + x2 + x3 + x4
    Fitted party:
    [1] root
    |   [2] x4 <= 30: 0.927 (n = 17, err = 1.1)
    |   [3] x4 > 30
    |   |   [4] x2 <= 43: 0.472 (n = 8, err = 0.4)
    |   |   [5] x2 > 43
    |   |   |   [6] x3 <= 0.4: 0.282 (n = 3, err = 0.0)
    |   |   |   [7] x3 > 0.4: 0.020 (n = 8, err = 0.0)
    Number of inner nodes:    3
    Number of terminal nodes: 4
    Many thanks!
    Amber Nolder
    Graduate Student
    Indiana University of Pennsylvania
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


