On Aug 5, 2011, at 12:53 PM, Paul Smith wrote:
On Fri, Aug 5, 2011 at 5:35 PM, David Winsemius <dwinsem...@comcast.net> wrote:
I have just estimated this model:
-----------------------------------------------------------
Logistic Regression Model

lrm(formula = Y ~ X16, x = T, y = T)

                      Model Likelihood     Discrimination    Rank Discrim.
                         Ratio Test            Indexes          Indexes
Obs            82    LR chi2      5.58    R2       0.088    C       0.607
 0             46    d.f.            1    g        0.488    Dxy     0.215
 1             36    Pr(> chi2) 0.0182    gr       1.629    gamma   0.589
max |deriv| 9e-11                         gp       0.107    tau-a   0.107
                                          Brier    0.231

          Coef    S.E.   Wald Z Pr(>|Z|)
Intercept -1.3218 0.5627 -2.35  0.0188
X16=1      1.3535 0.6166  2.20  0.0282
-----------------------------------------------------------
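For readers following along, here is a minimal sketch of how such a model is fit with rms::lrm. The original data are not posted, so the data below are simulated stand-ins and the numbers will not match the output above.
-----------------------------------------------------------
library(rms)   # provides lrm() and residuals.lrm()

## Simulated stand-in data: the real Y and X16 are not shown in the
## thread, so these values are illustrative only.
set.seed(1)
X16 <- factor(sample(0:1, 82, replace = TRUE))
Y   <- rbinom(82, 1, ifelse(X16 == "1", 0.5, 0.2))

## x = TRUE, y = TRUE store the design matrix and response with the
## fit; the 'gof' residuals method needs them later.
model.lrm <- lrm(Y ~ X16, x = TRUE, y = TRUE)
print(model.lrm)
-----------------------------------------------------------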
Analyzing the goodness of fit:
-----------------------------------------------------------
resid(model.lrm,'gof')

Sum of squared errors     Expected value|H0                    SD
         1.890393e+01          1.890393e+01          6.073415e-16
                    Z                     P
        -8.638125e+04          0.000000e+00
-----------------------------------------------------------
From the above calculated p-value (0.000000e+00), one should discard
this model. However, something is puzzling me: if the
'Expected value|H0' coincides so closely with the 'Sum of squared
errors', why should one discard the model? I am certainly missing
something.
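As a sketch of what those numbers mean (assuming the simulated model.lrm fit above): the 'gof' test is the le Cessie - van Houwelingen unweighted sum-of-squares test, and the figures in the output are consistent with Z being the standardized difference (SSE - E|H0) / SD, so the components can be checked by hand:
-----------------------------------------------------------
## Assumes the simulated model.lrm from the earlier sketch.
gof <- resid(model.lrm, "gof")

## Recompute the observed sum of squared errors from the fitted
## probabilities; it should match the first entry of 'gof'.
p   <- predict(model.lrm, type = "fitted")
sse <- sum((model.lrm$y - p)^2)
all.equal(unname(gof["Sum of squared errors"]), sse)

## Z standardizes the observed-minus-expected difference, so a
## near-zero SD inflates Z even when the two values agree closely.
(gof["Sum of squared errors"] - gof["Expected value|H0"]) / gof["SD"]
-----------------------------------------------------------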
It's hard to tell what you are missing, since you have not described
your reasoning at all. So I guess what is in error is your expectation
that we would have drawn all of the unstated inferences that you draw
when offered the output from lrm. (I certainly did not draw the
inference that "one should discard the model".)
resid is a function designed for use with glm and lm models. Why
aren't you using residuals.lrm?
----------------------------------------------------------
residuals.lrm(model.lrm,'gof')

Sum of squared errors     Expected value|H0                    SD
         1.890393e+01          1.890393e+01          6.073415e-16
                    Z                     P
        -8.638125e+04          0.000000e+00
----------------------------------------------------------
Great. Now please answer the more fundamental question: why do you
think this means "discard the model"?
Before answering that, let me tell you that resid(model.lrm,'gof')
calls residuals.lrm(), so both approaches produce the same results.
(See the examples given by ?residuals.lrm.)
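A quick way to confirm that dispatch (again assuming the simulated fit above):
-----------------------------------------------------------
## resid() and residuals() are S3 generics; an lrm fit carries the
## classes below, so both dispatch to residuals.lrm().
class(model.lrm)    # "lrm" "rms" "glm"
identical(resid(model.lrm, "gof"), residuals.lrm(model.lrm, "gof"))
-----------------------------------------------------------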
To answer your question, I invoke the reasoning given by Frank Harrell at:
http://r.789695.n4.nabble.com/Hosmer-Lemeshow-goodness-of-fit-td3508127.html
He writes:
«The test in the rms package's residuals.lrm function is the le Cessie
- van Houwelingen - Copas - Hosmer unweighted sum of squares test for
global goodness of fit. Like all statistical tests, a large P-value
has no information other than there was not sufficient evidence to
reject the null hypothesis. Here the null hypothesis is that the true
probabilities are those specified by the model.»
How does that apply to your situation? You have a small (one might
even say infinitesimal) p-value.

Does it not follow from Harrell's argument that if the p-value is
zero one should reject the null hypothesis?
No, it doesn't follow at all, since that is not what he said. You are
committing a common logical error: "if A then B" does _not_ imply "if
not-A then not-B". Harrell said only what a large p-value means; his
statement says nothing about what a near-zero one means.
Please correct me if what I say is not correct, and please direct me
towards a way of establishing the goodness of fit of my model.
You need to state your research objectives and describe the science in
your domain. Then you need to describe your data-gathering methods and
your analytic process. Then there might be a basis for further comment.
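As a concrete starting point (not suggested in the thread, so treat it as an assumption): since the model was fit with x = TRUE, y = TRUE, rms can also draw a bootstrap overfitting-corrected calibration curve, a common complement to the global goodness-of-fit test:
-----------------------------------------------------------
## Bootstrap calibration curve (needs x = TRUE, y = TRUE in the fit);
## B is the number of bootstrap repetitions.
cal <- calibrate(model.lrm, B = 200)
plot(cal)
-----------------------------------------------------------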
--
David Winsemius, MD
West Hartford, CT