On Fri, Aug 5, 2011 at 7:07 PM, David Winsemius <dwinsem...@comcast.net> wrote:

>>>>>> I have just estimated this model:
>>>>>> -----------------------------------------------------------
>>>>>> Logistic Regression Model
>>>>>>
>>>>>> lrm(formula = Y ~ X16, x = T, y = T)
>>>>>>
>>>>>>                      Model Likelihood    Discrimination    Rank Discrim.
>>>>>>                         Ratio Test          Indexes          Indexes
>>>>>>
>>>>>> Obs            82    LR chi2     5.58    R2      0.088    C       0.607
>>>>>>  0             46    d.f.           1    g       0.488    Dxy     0.215
>>>>>>  1             36    Pr(> chi2) 0.0182   gr      1.629    gamma   0.589
>>>>>> max |deriv| 9e-11                        gp      0.107    tau-a   0.107
>>>>>>                                          Brier   0.231
>>>>>>
>>>>>>           Coef    S.E.   Wald Z Pr(>|Z|)
>>>>>> Intercept -1.3218 0.5627 -2.35  0.0188
>>>>>> X16=1      1.3535 0.6166  2.20  0.0282
>>>>>> -----------------------------------------------------------
>>>>>>
>>>>>> Analyzing the goodness of fit:
>>>>>>
>>>>>> -----------------------------------------------------------
>>>>>>> resid(model.lrm,'gof')
>>>>>>
>>>>>> Sum of squared errors   Expected value|H0              SD
>>>>>>          1.890393e+01        1.890393e+01    6.073415e-16
>>>>>>                    Z                   P
>>>>>>        -8.638125e+04        0.000000e+00
>>>>>> -----------------------------------------------------------
>>>>>>
>>>>>> From the above calculated p-value (0.000000e+00), one should discard
>>>>>> this model. However, there is something that is puzzling me: if the
>>>>>> 'Expected value|H0' is so coincidental with the 'Sum of squared
>>>>>> errors', why should one discard the model? I am certainly missing
>>>>>> something.
>>>>>
>>>>> It's hard to tell what you are missing, since you have not described
>>>>> your reasoning at all. So I guess what is in error is your expectation
>>>>> that we would have drawn all of the unstated inferences that you draw
>>>>> when offered the output from lrm. (I certainly did not draw the
>>>>> inference that "one should discard the model".)
>>>>>
>>>>> resid is a function designed for use with glm and lm models. Why
>>>>> aren't you using residuals.lrm?
>>>> ----------------------------------------------------------
>>>>> residuals.lrm(model.lrm,'gof')
>>>>
>>>> Sum of squared errors   Expected value|H0              SD
>>>>          1.890393e+01        1.890393e+01    6.073415e-16
>>>>                    Z                   P
>>>>        -8.638125e+04        0.000000e+00
>>>
>>> Great. Now please answer the more fundamental question. Why do you
>>> think this means "discard the model"?
>>
>> Before answering that, let me point out that
>>
>> resid(model.lrm,'gof')
>>
>> calls residuals.lrm() -- so both approaches produce the same results.
>> (See the examples given by ?residuals.lrm)
>>
>> To answer your question, I invoke the reasoning given by Frank Harrell at:
>>
>> http://r.789695.n4.nabble.com/Hosmer-Lemeshow-goodness-of-fit-td3508127.html
>>
>> He writes:
>>
>> «The test in the rms package's residuals.lrm function is the le Cessie
>> - van Houwelingen - Copas - Hosmer unweighted sum of squares test for
>> global goodness of fit. Like all statistical tests, a large P-value
>> has no information other than there was not sufficient evidence to
>> reject the null hypothesis. Here the null hypothesis is that the true
>> probabilities are those specified by the model.»
>
> How does that apply to your situation? You have a small (one might even
> say infinitesimal) p-value.
>
>> From Harrell's argument, does it not follow that if the p-value is zero
>> one should reject the null hypothesis?
>
> No, it doesn't follow at all, since that is not what he said. You are
> committing a common logical error. "If A then B" does _not_ imply "If
> not-A then not-B".
>
>> Please correct me if what I say is not correct, and please direct me
>> towards a way of establishing the goodness of fit of my model.
>
> You need to state your research objectives and describe the science in
> your domain. Then you need to describe your data-gathering methods and
> your analytic process. Then there might be a basis for further comment.
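For anyone following along, here is a minimal, self-contained sketch of the call sequence being discussed. The data are simulated (the original data set is not available); only the variable names Y and X16 and the lrm()/residuals.lrm() calls follow the original post. It requires the rms package from CRAN.

```r
# Minimal sketch with simulated data (NOT the original data set),
# illustrating that resid() dispatches to residuals.lrm() for an lrm
# fit, so both calls return the same goodness-of-fit output.
library(rms)

set.seed(1)
X16 <- factor(rbinom(82, 1, 0.5))                       # binary predictor, as in the post
Y   <- rbinom(82, 1, ifelse(X16 == "1", 0.5, 0.2))      # binary outcome

# x = TRUE, y = TRUE stores the design matrix and response,
# which residuals.lrm needs for the 'gof' test
fit <- lrm(Y ~ X16, x = TRUE, y = TRUE)

# Both lines run the le Cessie - van Houwelingen - Copas - Hosmer
# unweighted sum-of-squares test and give identical results:
resid(fit, "gof")
residuals.lrm(fit, "gof")
```

Since `fit` is of class "lrm", the generic resid()/residuals() dispatches to the residuals.lrm method, which is why the two calls in the thread printed the same table.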
I will try to read the original paper where this goodness-of-fit test is
proposed to clarify my doubts. In any case, in the paper

@article{barnes2008model,
  title     = {A model to predict outcomes for endovascular aneurysm repair using preoperative variables},
  author    = {Barnes, M. and Boult, M. and Maddern, G. and Fitridge, R.},
  journal   = {European Journal of Vascular and Endovascular Surgery},
  volume    = {35},
  number    = {5},
  pages     = {571--579},
  year      = {2008},
  publisher = {Elsevier}
}

it is written:

«Table 5 lists the results of the global goodness of fit test for each
outcome model using the le Cessie-van Houwelingen-Copas-Hosmer unweighted
sum of squares test. In the table a 'good' fit is indicated by large
p-values (p > 0.05). Lack of fit is indicated by low p-values (p < 0.05).
All p-values indicate that the outcome models have reasonable fit, with
the exception of the outcome model for conversion to open repairs
(p = 0.04). The low p-value suggests a lack of fit and it may be worth
refining the model for conversion to open repair.»

In short, according to these authors, low p-values seem to suggest lack
of fit.

Paul

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.