On Aug 5, 2011, at 2:29 PM, Paul Smith wrote:
On Fri, Aug 5, 2011 at 7:07 PM, David Winsemius <dwinsem...@comcast.net> wrote:
I have just estimated this model:
-----------------------------------------------------------
Logistic Regression Model
lrm(formula = Y ~ X16, x = T, y = T)
                       Model Likelihood     Discrimination    Rank Discrim.
                          Ratio Test            Indexes          Indexes
Obs            82    LR chi2       5.58    R2       0.088    C       0.607
 0             46    d.f.             1    g        0.488    Dxy     0.215
 1             36    Pr(> chi2)  0.0182    gr       1.629    gamma   0.589
max |deriv| 9e-11                          gp       0.107    tau-a   0.107
                                           Brier    0.231

          Coef    S.E.   Wald Z Pr(>|Z|)
Intercept -1.3218 0.5627 -2.35  0.0188
X16=1      1.3535 0.6166  2.20  0.0282
-----------------------------------------------------------
Analyzing the goodness of fit:
-----------------------------------------------------------
resid(model.lrm,'gof')
Sum of squared errors     Expected value|H0                    SD
         1.890393e+01          1.890393e+01          6.073415e-16
                    Z                     P
        -8.638125e+04          0.000000e+00
-----------------------------------------------------------
From the above calculated p-value (0.000000e+00), one should discard
this model. However, there is something that is puzzling me: if the
'Expected value|H0' coincides so closely with the 'Sum of squared
errors', why should one discard the model? I am certainly missing
something.
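For anyone who wants to reproduce this kind of output, here is a minimal
sketch on simulated data (Y, X16 and model.lrm are only the names used
above; the simulated data are made up, not the original data set):
-----------------------------------------------------------
## Minimal sketch on simulated data -- not the original data
library(rms)

set.seed(1)
X16 <- factor(rbinom(82, 1, 0.5))                    # binary predictor, 82 obs
Y   <- rbinom(82, 1, ifelse(X16 == "1", 0.55, 0.30)) # binary outcome

## x = TRUE, y = TRUE are needed so that residuals(..., type = "gof") works
model.lrm <- lrm(Y ~ X16, x = TRUE, y = TRUE)
model.lrm

## le Cessie - van Houwelingen - Copas - Hosmer unweighted sum of squares test
residuals(model.lrm, type = "gof")
-----------------------------------------------------------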
It's hard to tell what you are missing, since you have not described
your reasoning at all. So I guess what is in error is your expectation
that we would have drawn all of the unstated inferences that you draw
when offered the output from lrm. (I certainly did not draw the
inference that "one should discard the model".)
resid is a function designed for use with glm and lm models. Why aren't
you using residuals.lrm?
----------------------------------------------------------
residuals.lrm(model.lrm,'gof')
Sum of squared errors     Expected value|H0                    SD
         1.890393e+01          1.890393e+01          6.073415e-16
                    Z                     P
        -8.638125e+04          0.000000e+00
Great. Now please answer the more fundamental question. Why do you think
this means "discard the model"?
Before answering that, let me tell you that resid(model.lrm,'gof')
calls residuals.lrm() -- so both approaches produce the same results.
(See the examples given by ?residuals.lrm.)
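A small illustration of that dispatch (assuming model.lrm is any fit
produced by lrm(..., x = TRUE, y = TRUE), as above):
-----------------------------------------------------------
## resid() is an alias for the residuals() generic, and the fit inherits
## from class "lrm", so both calls dispatch to residuals.lrm():
inherits(model.lrm, "lrm")                    # TRUE
resid(model.lrm, "gof")
residuals(model.lrm, type = "gof")
identical(resid(model.lrm, "gof"),
          residuals(model.lrm, type = "gof")) # TRUE
-----------------------------------------------------------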
To answer your question, I invoke the reasoning given by Frank Harrell at:
http://r.789695.n4.nabble.com/Hosmer-Lemeshow-goodness-of-fit-td3508127.html
He writes:
«The test in the rms package's residuals.lrm function is the le Cessie
- van Houwelingen - Copas - Hosmer unweighted sum of squares test for
global goodness of fit. Like all statistical tests, a large P-value
has no information other than there was not sufficient evidence to
reject the null hypothesis. Here the null hypothesis is that the true
probabilities are those specified by the model.»
How does that apply to your situation? You have a small (one might even
say infinitesimal) p-value.
Does it not follow from Harrell's argument that if the p-value is zero
one should reject the null hypothesis?
No, it doesn't follow at all, since that is not what he said. You are
committing a common logical error. "If A then B" does _not_ imply
"If Not-A then Not-B".
Please correct me if what I say is not correct, and please direct me
towards a way of establishing the goodness of fit of my model.
You need to state your research objectives and describe the science in
your domain. Then you need to describe your data gathering methods and
your analytic process. Then there might be a basis for further comment.
I will try to read the original paper in which this goodness-of-fit test
is proposed, to clarify my doubts. In any case, in the paper
@article{barnes2008model,
title={A model to predict outcomes for endovascular aneurysm repair
using preoperative variables},
author={Barnes, M. and Boult, M. and Maddern, G. and Fitridge, R.},
journal={European Journal of Vascular and Endovascular Surgery},
volume={35},
number={5},
pages={571--579},
year={2008},
publisher={Elsevier}
}
it is written:
«Table 5 lists the results of the global goodness of fit test for each
outcome model using the le Cessie-van Houwelingen-Copas-Hosmer
unweighted sum of squares test. In the table a 'good' fit is indicated
by large p-values (p > 0.05). Lack of fit is indicated by low p-values
(p < 0.05). All p-values indicate that the outcome models have
reasonable fit, with the exception of the outcome model for conversion
to open repairs (p = 0.04). The low p-value suggests a lack of fit and
it may be worth refining the model for conversion to open repair.»
In short, according to these authors, low p-values seem to suggest lack of fit.
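Applied to the output above, that reading would look like this (a
sketch, assuming the vector returned by residuals(..., type = "gof") is
named as in the printout, and using the conventional 0.05 cutoff quoted
from the paper):
-----------------------------------------------------------
gof <- residuals(model.lrm, type = "gof")
gof["P"]   # global goodness-of-fit p-value

## Decision rule as quoted from Barnes et al. (p > 0.05 indicates 'good' fit)
if (gof["P"] < 0.05) {
  message("Low p-value: evidence of lack of fit; consider refining the model.")
} else {
  message("No evidence of lack of fit at the 0.05 level.")
}
-----------------------------------------------------------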
Paul
David Winsemius, MD
West Hartford, CT