Thanks for taking the time to look at this issue with changes in the survival 
package. 

There are warnings (different ones) with any of the versions mentioned - so it 
is difficult to tell what to not ignore; -). In fact it is somewhat central and 
essential to the surrounding text 's argument that the matrices are expected to 
be singular, but the rank is used as an indicator to explore interaction - and 
lacking there of - of parameters. As far as I know this is also a stripped down 
version of a published paper. 

Scientific discovery which relies on tuning of numbers sounds "wonderful" - 
perhaps it is just bad science & bad use of statistics. This sounds like 
picking one's tuning number to fit one's desired conclusions. 

Would it be appropriate to generate an error with stop(), or something more 
visible? Yes, it would br nice to "hand-hold" the fitting, but at least in the 
outlined usage in that vignette, a large number of models are thrown in with a 
lot of data to see "what sticks", hand-holding individual fits is neither 
feasible nor appropriate. 

(I didn't write that vignette - I am just checking through it as a small part 
of a bigger task.)

Hin-tak 

------------------------------
On Fri, Dec 13, 2013 13:01 GMT Terry Therneau wrote:

>I was sent a copy of the data, and this is what I get on a different machine:
> fit <- clogit(cc ~ addContr(A) + addContr(C) + addContr(A.C) + strata(set),
>         data=pscc)
>Warning messages:
>1: In fitter(X, Y, strats, offset, init, control, weights = weights,  :
>  Loglik converged before variable  1,2,3,4 ; beta may be infinite.
>2: In coxph(formula = Surv(rep(1, 13110L), cc) ~ addContr(A) + addContr(C) +  :
>  X matrix deemed to be singular; variable 5
>
> fit
>                     coef exp(coef) se(coef)         z  p
>addContr(A)2     -0.14250     0.867   110812 -1.29e-06  1
>addContr(C)2      0.00525     1.005   110812  4.74e-08  1
>addContr(A.C)1-2 -0.30097     0.740   110812 -2.72e-06  1
>addContr(A.C)2-1 -0.47712     0.621   110812 -4.31e-06  1
>addContr(A.C)2-2       NA        NA        0        NA NA
>
> xmat <- model.matrix(fit)
> svd(xmat)$d
>[1] 1.932097e+02 2.700101e+01 1.624731e+01 6.049630e-15 2.031334e-15
>
>The primary issue is that the covariates matrix is singular, having rank 3 
>instead of rank 5.
>The coxph routine prints two warning messages that things are not good about 
>the matrix. Warning messages should not be ignored!  The insane se(coef) 
>values in the printed result are an even bigger clue that the model fit is 
>suspect. Unfortunately, some small change in the iteration path or numerics 
>has put this data set over the edge from being seen as rank 3 (old run) to 
>rank 4 (new run).  Moral: coxph does pretty well at detecting redundat 
>variables, but if you know of some it never hurts to help the routine out by 
>removing them before the fit.
>
>Singularity of the X matrix in a Cox model is very difficult to detect 
>reliably; the current threshold is the result of long experience and 
>experiment to give as few false messages as possible.  (The RMS package in 
>particular used truncated power basis functions for the splines, which lead to 
>X matrices that look almost singular numerically, but are not.)  Setting a 
>little less stringent threshold for declaring singularity in the cholesky 
>decompostion sufficies for this data set.
>
>fit2 <- clogit(cc ~ addContr(A) + addContr(C) + addContr(A.C) + strata(set),
>         data=pscc, toler.chol=1e-10)
>
>I'll certainly add this to my list of test problems that I use to tune those 
>constants.
>
>Terry Therneau
>
>On 12/11/2013 09:30 PM, Hin-Tak Leung wrote:
> Here is a rather long discussion etc about freetype 2.5.2, problem with the 
> survival
> package, and build R 2.15.x with gcc 4.8.x. Please feel free to skip forward.
> 
> - freetype 2.5.2:
> 
> the fix to cope with one of the Mac OS X's system fonts just before the 
> release of
> freetype 2.5.1 caused a regression, crashing over one of Microsoft windows' 
> system fonts.
> So there is a 2.5.2. There are new 2.5.2 bundles for windows & Mac OS X. The 
> official
> win/mac binaries of R were built statically with 2+-years-old freetype with a 
> few known
> problems. Most should upgrade/rebuild.
> 
> http://sourceforge.net/projects/outmodedbonsai/files/R/
> 
> - problem with the survival package:
> 
> Trying to re-run a vignette to get the same result as two years ago
> reveal a strange change. I went and bisected it down to
> r11513 and r11516 of the survival package.
> 
> -------------- r11513 --------------------
> clogit(cc ~ addContr(A) + addContr(C) + addContr(A.C) + strata(set))
> 
> 
>                     coef exp(coef) se(coef)     z      p
> addContr(A)2     -0.620     0.538    0.217 -2.86 0.0043
> addContr(C)2      0.482     1.620    0.217  2.22 0.0270
> addContr(A.C)1-2 -0.778     0.459    0.275 -2.83 0.0047
> addContr(A.C)2-1     NA        NA    0.000    NA     NA
> addContr(A.C)2-2     NA        NA    0.000    NA     NA
> 
> Likelihood ratio test=26  on 3 df, p=9.49e-06  n= 13110, number of events= 
> 3524
> ------------------------------------------
> 
> ------------- r11516 ---------------------
> clogit(cc ~ addContr(A) + addContr(C) + addContr(A.C) + strata(set))
> 
> 
>                       coef exp(coef) se(coef)         z  p
> addContr(A)2     -0.14250     0.867   110812 -1.29e-06  1
> addContr(C)2      0.00525     1.005   110812  4.74e-08  1
> addContr(A.C)1-2 -0.30097     0.740   110812 -2.72e-06  1
> addContr(A.C)2-1 -0.47712     0.621   110812 -4.31e-06  1
> addContr(A.C)2-2       NA        NA        0        NA NA
> 
> Likelihood ratio test=26  on 4 df, p=3.15e-05  n= 13110, number of events= 
> 3524
> ------------------------------------------
> 

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to