On Aug 5, 2011, at 2:29 PM, Paul Smith wrote:
On Fri, Aug 5, 2011 at 7:07 PM, David Winsemius <dwinsem...@comcast.net> wrote:
I have just estimated this model:
-----------------------------------------------------------
Logistic Regression Model
lrm(formula = Y ~ X16, x = T, y = T)
                       Model Likelihood     Discrimination    Rank Discrim.
                          Ratio Test            Indexes          Indexes
Obs            82    LR chi2       5.58    R2       0.088    C       0.607
 0             46    d.f.             1    g        0.488    Dxy     0.215
 1             36    Pr(> chi2)  0.0182    gr       1.629    gamma   0.589
max |deriv| 9e-11                          gp       0.107    tau-a   0.107
                                           Brier    0.231

          Coef    S.E.   Wald Z Pr(>|Z|)
Intercept -1.3218 0.5627 -2.35  0.0188
X16=1      1.3535 0.6166  2.20  0.0282
-----------------------------------------------------------
Analyzing the goodness of fit:
-----------------------------------------------------------
resid(model.lrm,'gof')
Sum of squared errors     Expected value|H0                    SD
         1.890393e+01          1.890393e+01          6.073415e-16
                    Z                     P
        -8.638125e+04          0.000000e+00
-----------------------------------------------------------
From the above calculated p-value (0.000000e+00), one should discard
this model. However, there is something that is puzzling me: if the
'Expected value|H0' coincides so closely with the 'Sum of squared
errors', why should one discard the model? I am certainly missing
something.
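For anyone who wants to reproduce this kind of output, here is a minimal
sketch on simulated data (Y, X16 and model.lrm are only the names used
above; the simulated data are made up, not the original data set):
-----------------------------------------------------------
## Minimal sketch on simulated data -- not the original data
library(rms)

set.seed(1)
X16 <- factor(rbinom(82, 1, 0.5))                    # binary predictor, 82 obs
Y   <- rbinom(82, 1, ifelse(X16 == "1", 0.55, 0.30)) # binary outcome

## x = TRUE, y = TRUE are needed so that residuals(..., type = "gof") works
model.lrm <- lrm(Y ~ X16, x = TRUE, y = TRUE)
model.lrm

## le Cessie - van Houwelingen - Copas - Hosmer unweighted sum of squares test
residuals(model.lrm, type = "gof")
-----------------------------------------------------------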
It's hard to tell what you are missing, since you have not described
your reasoning at all. So I guess what is in error is your expectation
that we would have drawn all of the unstated inferences that you draw
when offered the output from lrm. (I certainly did not draw the
inference that "one should discard the model".)
resid is a function designed for use with glm and lm models. Why aren't
you using residuals.lrm?
----------------------------------------------------------
residuals.lrm(model.lrm,'gof')
Sum of squared errors     Expected value|H0                    SD
         1.890393e+01          1.890393e+01          6.073415e-16
                    Z                     P
        -8.638125e+04          0.000000e+00
Great. Now please answer the more fundamental question. Why do you think
this means "discard the model"?
Before answering that, let me tell you that resid(model.lrm,'gof')
calls residuals.lrm() -- so both approaches produce the same results.
(See the examples given by ?residuals.lrm.)
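A small illustration of that dispatch (assuming model.lrm is any fit
produced by lrm(..., x = TRUE, y = TRUE), as above):
-----------------------------------------------------------
## resid() is an alias for the residuals() generic, and the fit inherits
## from class "lrm", so both calls dispatch to residuals.lrm():
inherits(model.lrm, "lrm")                    # TRUE
resid(model.lrm, "gof")
residuals(model.lrm, type = "gof")
identical(resid(model.lrm, "gof"),
          residuals(model.lrm, type = "gof")) # TRUE
-----------------------------------------------------------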
To answer your question, I invoke the reasoning given by Frank Harrell at:
http://r.789695.n4.nabble.com/Hosmer-Lemeshow-goodness-of-fit-td3508127.html
He writes:
«The test in the rms package's residuals.lrm function is the le Cessie
- van Houwelingen - Copas - Hosmer unweighted sum of squares test for
global goodness of fit. Like all statistical tests, a large P-value
has no information other than there was not sufficient evidence to
reject the null hypothesis. Here the null hypothesis is that the true
probabilities are those specified by the model.»
How does that apply to your situation? You have a small (one might even
say infinitesimal) p-value.
Does it not follow from Harrell's argument that if the p-value is zero
one should reject the null hypothesis?
No, it doesn't follow at all, since that is not what he said. You are
committing a common logical error. "If A then B" does _not_ imply
"If Not-A then Not-B".
Please correct me if what I say is not correct, and please direct me
towards a way of establishing the goodness of fit of my model.
You need to state your research objectives and describe the science in
your domain. Then you need to describe your data gathering methods and
your analytic process. Then there might be a basis for further comment.
I will try to read the original paper in which this goodness-of-fit test
is proposed, to clarify my doubts. In any case, in the paper
@article{barnes2008model,
title={A model to predict outcomes for endovascular aneurysm repair
using preoperative variables},
author={Barnes, M. and Boult, M. and Maddern, G. and Fitridge, R.},
journal={European Journal of Vascular and Endovascular Surgery},
volume={35},
number={5},
pages={571--579},
year={2008},
publisher={Elsevier}
}
it is written:
«Table 5 lists the results of the global goodness of fit test for each
outcome model using the le Cessie-van Houwelingen-Copas-Hosmer
unweighted sum of squares test. In the table a 'good' fit is indicated
by large p-values (p > 0.05). Lack of fit is indicated by low p-values
(p < 0.05). All p-values indicate that the outcome models have
reasonable fit, with the exception of the outcome model for conversion
to open repairs (p = 0.04). The low p-value suggests a lack of fit and
it may be worth refining the model for conversion to open repair.»
In short, according to these authors, low p-values seem to suggest lack of fit.
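Applied to the output above, that reading would look like this (a
sketch, assuming the vector returned by residuals(..., type = "gof") is
named as in the printout, and using the conventional 0.05 cutoff quoted
from the paper):
-----------------------------------------------------------
gof <- residuals(model.lrm, type = "gof")
gof["P"]   # global goodness-of-fit p-value

## Decision rule as quoted from Barnes et al. (p > 0.05 indicates 'good' fit)
if (gof["P"] < 0.05) {
  message("Low p-value: evidence of lack of fit; consider refining the model.")
} else {
  message("No evidence of lack of fit at the 0.05 level.")
}
-----------------------------------------------------------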
Paul
David Winsemius, MD
West Hartford, CT