Re: [R] party with mob - parameter estimates not significant in terminal nodes

Achim Zeileis Tue, 05 Oct 2010 06:47:56 -0700

Tudor:

I successfully model-based partitioned several datasets through the use of mob from the party package (thanks Achim et al. once again !!!). At times, however, the partitioning leads to terminal nodes in which the parameter estimates of the model are not significant (although the split points and in general the proposed segmentation both seem reasonable).


There are two aspects to this:

(1) The algorithm just determines whether the coefficients between two child nodes are significantly different. It may or may not be the case that they are significantly different from zero within each node. As an example: You may have a tree with a single split and two child nodes. In the first child node, you have a highly significant parameter value, but in the second node, you have no significant value.
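
A minimal sketch of point (1), on made-up data rather than on an actual mob() fit: the vectors x, y_a and y_b are hypothetical stand-ins for the regressor and the responses in two child nodes. In node A the share of successes rises with x, while in node B it is exactly one half at every x, so the two slopes differ clearly although only one of them differs from zero.

```r
# Hypothetical data for two child nodes (not taken from the original post).
x <- rep(1:9, each = 10)

# Node A: at x = k there are k successes out of 10 observations.
y_a <- unlist(lapply(1:9, function(k) rep(c(1, 0), times = c(k, 10 - k))))

# Node B: 5 successes out of 10 at every x, so x carries no information.
y_b <- rep(rep(c(1, 0), each = 5), times = 9)

# Separate binomial GLMs, one per (assumed) node.
fit_a <- glm(y_a ~ x, family = binomial)
fit_b <- glm(y_b ~ x, family = binomial)

# Wald p-values of the slope within each node.
p_a <- coef(summary(fit_a))["x", "Pr(>|z|)"]
p_b <- coef(summary(fit_b))["x", "Pr(>|z|)"]
print(c(node_A = p_a, node_B = p_b))
```

A split between two such nodes would be sensible, yet the summary for node B would show no significant slope.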

(2) Due to partitioning, it may be the case that not all parameters of the model are identified in all child nodes. Currently, within mob(), this is not systematically checked. In particular, you may have (quasi-)complete separation in binomial GLMs if a child node is particularly "pure". This seems to have happened in your example below. From a machine learning point of view, this is not a bad thing, you just need to interpret it correctly.
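
To illustrate point (2), here is a small sketch with hypothetical data (x and y are invented, and no mob() call is involved) in which the response is perfectly separated by the regressor. The pattern it should produce resembles the Node 3 output: very large estimates and standard errors, a p-value close to 1, and a residual deviance of essentially zero.

```r
# Hypothetical, perfectly separated node: y is 0 for x <= 10 and 1 otherwise.
x <- 1:20
y <- as.integer(x > 10)

# Collect the warnings glm() raises instead of printing them.
msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(msgs)                # warnings hinting at separation
print(coef(summary(fit)))  # inflated estimates and standard errors
print(deviance(fit))       # residual deviance close to zero
```

The maximum likelihood estimate does not exist in this situation, so the reported coefficients only reflect where the iterations stopped.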

As I do not seem to be able to come up with an intuitive explanation/interpretation for this (other than that the partitioning model may be appropriate for parts of the dataset(s)), I wonder if any of you could share your thoughts on this topic with me. For your convenience I attached a relevant set of results below.

I guess that the variable "P" is binary and that, when you cross-tabulate it with the response for Node 3, there are zeros in the contingency table, i.e., you may have a perfect split in that one sub-sample.
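
A sketch of that check, assuming a binary regressor: p and y below are invented values standing in for "P" and the response within one node, not the poster's data. A zero cell in the cross-tabulation signals (quasi-)complete separation.

```r
# Hypothetical values for a binary regressor and a binary response in one node.
p <- rep(c(0, 1), each = 6)
y <- c(0, 0, 0, 0, 0, 0,   # p = 0: only failures
       0, 0, 1, 1, 1, 1)   # p = 1: failures and successes

# Cross-tabulate regressor and response; look for empty cells.
tab <- table(P = p, y = y)
print(tab)

n_zero <- sum(tab == 0)
print(n_zero)
```

With the real data, the same table() call would be applied to the observations falling into Node 3.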


hth,
Z


$`2`

Call:
NULL

Deviance Residuals:
                Min                   1Q               Median                   3Q                  Max
-2.1613499829328759  -0.1182099512510448   0.0000000000000000   0.1199438072333263   1.7963628663418680

Coefficients:
                       Estimate          Std. Error  z value              Pr(>|z|)
(Intercept) 38.6736721222665096  5.1182299436934375  7.55606  0.000000000000041545 ***
P           -3.8195232976021787  0.5042297985419135 -7.57497  0.000000000000035922 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 407.0806101624161  on 293  degrees of freedom
Residual deviance: 132.0087256781199  on 292  degrees of freedom
AIC: 136.0087256781199

Number of Fisher Scoring iterations: 7


$`3`

Call:
NULL

Deviance Residuals:
                    Min                       1Q                   Median                       3Q                      Max
-0.00009134433923085110   0.00000000000000000000   0.00000000000000000000   0.00000000000000000000   0.00009204763394325872

Coefficients:
                        Estimate           Std. Error  z value Pr(>|z|)
(Intercept)   1755.7555999083327 601505.6700290179579  0.00292  0.99767
P             -181.3394660743267  62127.5207770660636 -0.00292  0.99767

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 94.20918454290385568583588  on 67  degrees of freedom
Residual deviance:  0.00000001683616309495537  on 66  degrees of freedom
AIC: 4.000000016836163

Number of Fisher Scoring iterations: 25

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
