On Sat, 28 Mar 2009, Bob Green wrote:


Hello,

I am hoping for assistance with examining the contribution of stratified variables in a Cox regression. A previous post by Terry Therneau noted that "That is the point of a strata; you are declaring a variable to NOT be proportional hazards, and thus there is no single "hazard ratio" that describes it". Given this purpose of stratification, in the process of building and testing a model, is there a way to test whether the stratified variables add anything to a model?

I'm not aware of any formal test for whether stratification helps. It's 
difficult because you are adding an infinite-dimensional parameter to the 
model, and this parameter doesn't even appear in the partial likelihood. 
Nothing simple is going to work.

In principle one could compare the two stratum baseline cumulative hazards to
see if they were proportional to each other, e.g., see if the difference in
log-cumulative baseline hazard was constant over time. The bootstrap is valid
for the baseline cumulative hazards, so one could get confidence intervals on a
suitable summary statistic that way.
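
A minimal sketch of that idea, not a validated test. It assumes the `lung`
data shipped with the survival package as a stand-in for the poster's data,
stratifies on `sex`, and uses the slope of the log-cumulative-hazard
difference against log time as the summary statistic. The time grid, the 200
resamples, and the percentile interval are all illustrative choices, and the
helper names are hypothetical.

```r
library(survival)

# Difference in log cumulative baseline hazard between the two strata,
# evaluated on a common time grid. (Hypothetical helper, for illustration.)
log_cumhaz_diff <- function(data, grid) {
  fit <- coxph(Surv(time, status == 2) ~ age + strata(sex), data = data)
  bh  <- basehaz(fit, centered = FALSE)   # columns: hazard, time, strata
  by_stratum <- split(bh, bh$strata)
  # The cumulative hazard is a step function, so carry the last value forward
  H <- sapply(by_stratum, function(s)
    approx(s$time, s$hazard, xout = grid, method = "constant", rule = 2)$y)
  log(H[, 2]) - log(H[, 1])
}

# Summary statistic: slope of the difference against log time.
# Proportional baseline hazards would give a slope near zero.
slope_stat <- function(data, grid) {
  y  <- log_cumhaz_diff(data, grid)
  ok <- is.finite(y)                      # drop grid points where a hazard is 0
  x  <- log(grid[ok])
  y  <- y[ok]
  coef(lm(y ~ x))[["x"]]
}

set.seed(1)
grid <- seq(100, 600, by = 25)            # assumed range with events in both strata
obs  <- slope_stat(lung, grid)

# Nonparametric bootstrap: resample subjects with replacement and refit
boot <- replicate(200, {
  idx <- sample(nrow(lung), replace = TRUE)
  slope_stat(lung[idx, ], grid)
})
ci <- quantile(boot, c(0.025, 0.975), na.rm = TRUE)

obs
ci
```

An interval that comfortably covers zero would be consistent with
proportional baseline hazards over the chosen time range; it says nothing
about times outside the grid.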


Two variables were stratified because it was considered that the proportional hazards assumption was not met (via inspection of log-log plots, where the curves crossed). I have also examined cox.zph: none of the values were statistically significant. I did produce plots but found these difficult to interpret.

There isn't much information loss in stratifying, as long as it's not overdone, 
which is probably why there hasn't been much work on tests.  The main loss is 
that the model becomes more complicated and harder to summarize.
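
One rough way to look at the information loss in a given data set is to fit
the same model with the variable as a covariate and as a stratum, and compare
the standard errors of the remaining coefficients. The sketch below assumes
the `lung` data from the survival package and `sex` as the variable in
question; both are stand-ins, not the poster's data.

```r
library(survival)

# Same predictors, with sex entered as a covariate versus as a stratum
fit_cov    <- coxph(Surv(time, status == 2) ~ age + ph.ecog + sex,
                    data = lung)
fit_strata <- coxph(Surv(time, status == 2) ~ age + ph.ecog + strata(sex),
                    data = lung)

keep <- c("age", "ph.ecog")
se_compare <- cbind(
  covariate  = summary(fit_cov)$coefficients[keep, "se(coef)"],
  stratified = summary(fit_strata)$coefficients[keep, "se(coef)"]
)
se_compare
```

Standard errors that are close in the two columns suggest that little
precision was given up for the other coefficients by stratifying.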

The statistician I have been consulting said that in SPSS, when variables are stratified, a model is produced for each stratum (e.g., a separate analysis for males and females if a gender variable were stratified). I have not seen this approach used in the R examples I have come across.

Fitting a completely separate model for each stratum is equivalent to
stratifying *and* adding an interaction with stratum to each predictor
variable. This does result in a loss of information, and is usually overkill.
You can add stratum interactions just to the variables where they are needed.
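
The equivalence can be checked numerically. This is a sketch under
assumptions: it uses the `lung` data from the survival package, takes `sex`
as the stratum, and introduces a hypothetical 0/1 indicator `female` so that
the interaction terms are unambiguous.

```r
library(survival)

# `female` is an illustrative indicator derived from lung$sex (2 = female)
d <- transform(lung, female = as.numeric(sex == 2))

# A completely separate model for each stratum
fit_m <- coxph(Surv(time, status == 2) ~ age + ph.ecog,
               data = d, subset = female == 0)
fit_f <- coxph(Surv(time, status == 2) ~ age + ph.ecog,
               data = d, subset = female == 1)

# One stratified model with a stratum interaction on every predictor
fit_all <- coxph(Surv(time, status == 2) ~ age + ph.ecog +
                   age:female + ph.ecog:female + strata(female),
                 data = d)

b <- coef(fit_all)
male_from_all   <- unname(b[c("age", "ph.ecog")])
female_from_all <- unname(b[c("age", "ph.ecog")] +
                          b[c("age:female", "ph.ecog:female")])

# Each should agree with the separate fits up to convergence tolerance
rbind(separate = unname(coef(fit_m)), stratified = male_from_all)
rbind(separate = unname(coef(fit_f)), stratified = female_from_all)

# Interaction only where it is needed (here, by assumption, only for age)
fit_partial <- coxph(Surv(time, status == 2) ~ age + ph.ecog +
                       age:female + strata(female),
                     data = d)
```

`fit_partial` keeps a single `ph.ecog` coefficient pooled across strata while
letting the `age` effect differ, which is the middle ground described above.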

This may be related to the collision in terminology where epidemiologists say 
'stratify' to mean 'do a completely separate analysis' and statisticians say 
'stratify' to mean 'pool the stratum-specific analyses to get an overall 
estimate'.


       -thomas

Thomas Lumley                   Assoc. Professor, Biostatistics
tlum...@u.washington.edu        University of Washington, Seattle

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
