On Aug 5, 2011, at 12:53 PM, Paul Smith wrote:
On Fri, Aug 5, 2011 at 5:35 PM, David Winsemius <dwinsem...@comcast.net> wrote:
I have just estimated this model:
-----------------------------------------------------------
Logistic Regression Model

lrm(formula = Y ~ X16, x = T, y = T)

                      Model Likelihood     Discrimination    Rank Discrim.
                         Ratio Test            Indexes          Indexes
Obs            82    LR chi2      5.58    R2       0.088    C       0.607
 0             46    d.f.            1    g        0.488    Dxy     0.215
 1             36    Pr(> chi2) 0.0182    gr       1.629    gamma   0.589
max |deriv| 9e-11                         gp       0.107    tau-a   0.107
                                          Brier    0.231

          Coef    S.E.   Wald Z Pr(>|Z|)
Intercept -1.3218 0.5627 -2.35  0.0188
X16=1      1.3535 0.6166  2.20  0.0282
-----------------------------------------------------------
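For readers following along, here is a minimal sketch of how such a model is fit with rms::lrm. The original data are not posted, so the data below are simulated stand-ins and the numbers will not match the output above.
-----------------------------------------------------------
library(rms)   # provides lrm() and residuals.lrm()

## Simulated stand-in data: the real Y and X16 are not shown in the
## thread, so these values are illustrative only.
set.seed(1)
X16 <- factor(sample(0:1, 82, replace = TRUE))
Y   <- rbinom(82, 1, ifelse(X16 == "1", 0.5, 0.2))

## x = TRUE, y = TRUE store the design matrix and response with the
## fit; the 'gof' residuals method needs them later.
model.lrm <- lrm(Y ~ X16, x = TRUE, y = TRUE)
print(model.lrm)
-----------------------------------------------------------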
Analyzing the goodness of fit:
-----------------------------------------------------------
resid(model.lrm,'gof')

Sum of squared errors     Expected value|H0                    SD
         1.890393e+01          1.890393e+01          6.073415e-16
                    Z                     P
        -8.638125e+04          0.000000e+00
-----------------------------------------------------------
From the above calculated p-value (0.000000e+00), one should discard
this model. However, something is puzzling me: if the
'Expected value|H0' coincides so closely with the 'Sum of squared
errors', why should one discard the model? I am certainly missing
something.
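As a sketch of what those numbers mean (assuming the simulated model.lrm fit above): the 'gof' test is the le Cessie - van Houwelingen unweighted sum-of-squares test, and the figures in the output are consistent with Z being the standardized difference (SSE - E|H0) / SD, so the components can be checked by hand:
-----------------------------------------------------------
## Assumes the simulated model.lrm from the earlier sketch.
gof <- resid(model.lrm, "gof")

## Recompute the observed sum of squared errors from the fitted
## probabilities; it should match the first entry of 'gof'.
p   <- predict(model.lrm, type = "fitted")
sse <- sum((model.lrm$y - p)^2)
all.equal(unname(gof["Sum of squared errors"]), sse)

## Z standardizes the observed-minus-expected difference, so a
## near-zero SD inflates Z even when the two values agree closely.
(gof["Sum of squared errors"] - gof["Expected value|H0"]) / gof["SD"]
-----------------------------------------------------------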
It's hard to tell what you are missing, since you have not described
your reasoning at all. So I guess what is in error is your expectation
that we would have drawn all of the unstated inferences that you draw
when offered the output from lrm. (I certainly did not draw the
inference that "one should discard the model".)
resid is a function designed for use with glm and lm models. Why
aren't you using residuals.lrm?
----------------------------------------------------------
residuals.lrm(model.lrm,'gof')

Sum of squared errors     Expected value|H0                    SD
         1.890393e+01          1.890393e+01          6.073415e-16
                    Z                     P
        -8.638125e+04          0.000000e+00
----------------------------------------------------------
Great. Now please answer the more fundamental question: why do you
think this means "discard the model"?
Before answering that, let me tell you that resid(model.lrm,'gof')
calls residuals.lrm(), so both approaches produce the same results.
(See the examples given by ?residuals.lrm.)
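A quick way to confirm that dispatch (again assuming the simulated fit above):
-----------------------------------------------------------
## resid() and residuals() are S3 generics; an lrm fit carries the
## classes below, so both dispatch to residuals.lrm().
class(model.lrm)    # "lrm" "rms" "glm"
identical(resid(model.lrm, "gof"), residuals.lrm(model.lrm, "gof"))
-----------------------------------------------------------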
To answer your question, I invoke the reasoning given by Frank Harrell at:
http://r.789695.n4.nabble.com/Hosmer-Lemeshow-goodness-of-fit-td3508127.html
He writes:
«The test in the rms package's residuals.lrm function is the le Cessie
- van Houwelingen - Copas - Hosmer unweighted sum of squares test for
global goodness of fit. Like all statistical tests, a large P-value
has no information other than there was not sufficient evidence to
reject the null hypothesis. Here the null hypothesis is that the true
probabilities are those specified by the model.»
How does that apply to your situation? You have a small (one might
even say infinitesimal) p-value.

Does it not follow from Harrell's argument that if the p-value is
zero one should reject the null hypothesis?
No, it doesn't follow at all, since that is not what he said. You are
committing a common logical error: "if A then B" does _not_ imply "if
not-A then not-B". Harrell said only what a large p-value means; his
statement says nothing about what a near-zero one means.
Please correct me if what I say is not correct, and please direct me
towards a way of establishing the goodness of fit of my model.
You need to state your research objectives and describe the science in
your domain. Then you need to describe your data-gathering methods and
your analytic process. Then there might be a basis for further comment.
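As a concrete starting point (not suggested in the thread, so treat it as an assumption): since the model was fit with x = TRUE, y = TRUE, rms can also draw a bootstrap overfitting-corrected calibration curve, a common complement to the global goodness-of-fit test:
-----------------------------------------------------------
## Bootstrap calibration curve (needs x = TRUE, y = TRUE in the fit);
## B is the number of bootstrap repetitions.
cal <- calibrate(model.lrm, B = 200)
plot(cal)
-----------------------------------------------------------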
--
David Winsemius, MD
West Hartford, CT