On Wed, 2011-05-11 at 16:11 -0700, Shi, Tao wrote:
> Hi all,
> 
> I found that the two different versions of "survival" packages, namely 2.36-5 
> vs. 2.36-8 or later, give different results for coxph function.  Please see 
> below and the data is attached.  The second one was done on Linux, but 
> Windows 
> gave the same results.  Could you please let me know which one I should trust?
> 
> Thanks,

 In your case, neither.  Your data set has 22 events and 17 predictors;
the rule of thumb for a reliable Cox model is 10-20 events per predictor
which implies no more than 2 for your data set.  As a result, the
coefficients of your model have very wide confidence intervals, the coef
for Male for instance has se of 3.26, meaning the CI goes from 1/26 to
26 times the estimate; i.e., there is no biological meaning to the
estimate.

  Nevertheless, why did coxph give a different answer?  The later
version 2.36-9 failed to converge (20 iterations) with a final
log-likelihood of -19.94, the earlier code converges in 10 iterations to
-19.91.  In version 2.36-6 an extra check was put into the maximizer for
coxph in response to an exceptional data set which caused the routine to
fail due to overflow of the exp function; the Newton-Raphson iteration
algorithm had made a terrible guess in it's iteration path, which can
happen with all NR based search methods.  
   I put a limit on the size the linear predictor in the Cox model of
21.  The basic argument is that exp(linear-predictor) = relative risk
for a subject, and that there is not much biological meaning for risks
to be less than exp(-21) ~ 1/(population of the earh).  There is more to
the reasoning, interested parties should look at the comments in
src/coxsafe.c, a 5 line routine with 25 lines of discussion.  I will
happily accept input the "best" value for the constant.

   I never expected to see a data set with both convergence of the LL
and linear predictors larger than +-15.  Looking at the fit (older code)
> round(fit2$linear.predictor, 2)
 [1]   2.26   0.89   4.96 -19.09 -12.10   1.39   2.82   3.10
 [9]  18.57 -25.25  22.94   8.75   5.52 -27.64  14.88 -23.41
[17]  13.70 -28.45  -1.84  10.04  12.62   2.54   6.33  -8.76
[25]   9.68   4.39   2.92   3.51   6.02 -17.24   5.97

This says that, if the model is to be believed, you have several near
immortals in the data set. (Everyone else on earth will perish first).

Terry Therneau

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to