Hello, I am confused about the use of the id parameter for an event history/survival model, and about why the example in the eha package documentation for aftreg does not specify one. All assistance and insights are appreciated.
Attempting to specify an id variable with the documentation example generates an "overlapping intervals" error, so I sorted the original mort dataframe and, within each id, set every subsequent entry time to the previous exit time + 0.0001. This allowed me to see the effect of the id parameter on the coefficients and significance tests, and prompted my question. The code I used is shown below, with the results at the bottom.

Thanks in advance!
Mike

library(eha)  ## provides aftreg() and the mort data

head(mort)  ## data clearly contains multiple entries for some of the dataframe ids

no.id.aft <- aftreg(Surv(enter, exit, event) ~ ses, data = mort)  ## initial model
id.aft <- aftreg(Surv(enter, exit, event) ~ ses, data = mort, id = id)  ## overlapping intervals error

## ensure records ordered
mort.sort <- mort[order(mort$id, mort$enter), ]

## remove overlap
for (i in 2:nrow(mort.sort)) {
  if (mort.sort[i, 'id'] == mort.sort[i - 1, 'id'])
    mort.sort[i, 'enter'] <- mort.sort[i - 1, 'exit'] + 0.0001
}

no.id.aft.sort <- aftreg(Surv(enter, exit, event) ~ ses, data = mort.sort)  ## initial model on modified df
id.aft.sort <- aftreg(Surv(enter, exit, event) ~ ses, id = id, data = mort.sort)  ## with id parameter

#=== output ===========#

> no.id.aft.sort
Call:
aftreg(formula = Surv(enter, exit, event) ~ ses, data = mort.sort)

Covariate          W.mean      Coef Exp(Coef)  se(Coef)    Wald p
ses
           lower    0.416     0         1           (reference)
           upper    0.584    -0.347     0.707     0.089     0.000

log(scale)                    3.603    36.704     0.065     0.000
log(shape)                    0.331     1.393     0.058     0.000

Events                    276
Total time at risk        17045
Max. log. likelihood      -1391.4
LR test statistic         16.1
Degrees of freedom        1
Overall p-value           6.04394e-05

> id.aft.sort
Call:
aftreg(formula = Surv(enter, exit, event) ~ ses, data = mort.sort, id = id)

Covariate          W.mean      Coef Exp(Coef)  se(Coef)    Wald p
ses
           lower    0.416     0         1           (reference)
           upper    0.584    -0.364     0.695     0.090     0.000

log(scale)                    3.588    36.171     0.065     0.000
log(shape)                    0.338     1.402     0.058     0.000

Events                    276
Total time at risk        17045
Max. log. likelihood      -1390.8
LR test statistic         17.2
Degrees of freedom        1
Overall p-value           3.3091e-05