First, here is your message as it appears on R-help.

On 10/14/2012 05:00 AM, r-help-requ...@r-project.org wrote:
> I'm trying to set up proportional hazard model that is stratified with
> respect to covariate 1 and has an interaction between covariate 1 and
> another variable, covariate 2. Both variables are categorical. In the
> following, I try to illustrate the two problems that I've encountered, using
> the lung dataset.
>
> The first problem is the warning:
>
> To me, it seems that there are too many dummies generated.
>
> The second problem is the error:
>

Please try to fix this in the future (Nabble issue?)

As to the problems: handling strata by covariate interactions turns out to be
a bit of a pain in the posterior in the survival code. It would have worked,
however, if you had done the following:

    fit <- coxph(Surv(time, status) ~ strata(cov1) * cov2, data=...)
or
    ~ strata(cov1):cov2
or
    ~ strata(cov1):cov2 + cov2

But by using

    ~ strata(cov1) + cov1:cov2

you fooled the program into thinking that there was no strata by covariate
interaction, and so it did not follow the special logic necessary for that
case.

Second issue: The model.matrix function of R, common to nearly all the
modeling functions (including coxph), tries to guess which dummy variables
will be redundant and can therefore be removed from the X matrix before the
fit. Such an approach is doomed to failure. I'm actually surprised at how
often R guesses correctly, because until a matrix decomposition is actually
performed the only thing possible is an informed guess. Your particular case
gives rise to a larger than usual number of NA coefs (redundant columns), but
short of building your own X matrix by hand there isn't anything to be done
about it. Just ignore them.

Terry Therneau
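
For readers who want to try the strata-by-covariate form recommended above, here is a minimal sketch on the lung dataset. The choice of sex as cov1 and ph.ecog as cov2 is an assumption made only for illustration (the original post's variables did not come through), as is dropping the nearly empty ph.ecog level. No particular output is claimed.

```r
library(survival)

# Illustration only: 'sex' stands in for cov1 and 'ph.ecog' for cov2.
# Rows with a missing ph.ecog or the nearly empty level 3 are dropped.
lung2 <- subset(lung, !is.na(ph.ecog) & ph.ecog < 3)
lung2$cov1 <- factor(lung2$sex)
lung2$cov2 <- factor(lung2$ph.ecog)

# Stratify on cov1 and let the cov2 effect differ between strata.
# Writing the interaction with strata(cov1) lets coxph recognise it
# as a strata by covariate interaction.
fit <- coxph(Surv(time, status) ~ strata(cov1) * cov2, data = lung2)

# Any coefficients reported as NA correspond to redundant columns
# of the X matrix and can be ignored.
na_coefs <- names(coef(fit))[is.na(coef(fit))]
print(coef(fit))
print(na_coefs)
```

If na_coefs is non-empty, those entries are the redundant columns described under the second issue, not a sign that the fit failed.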