This is a follow-up on my note of Saturday. Let me start with two important clarifications - I think this would be a nice addition, but I've had exactly one use for it in the 15+ years of developing the survival package. - I have a work around for the current case. Prioritize accordingly.
The ideal would be to change survexp as follows: fit <- survexp( ~ gender, data=mydata, ratetable=survexp.us, ratevar=list(sex=gender, year=enroll.dt, age=age*365.25)) The model statement says that I want separate curves by gender, and is similar to other model statements. The ratevar option gives the mapping between my variable names and the dimnames of the survexp.us rate table. It wants age in days, enrollment date to be some sort of date object, and sex to be a factor. Then the heading of the R code would be m <- match.call() m <- m[c(1, match(names(m), c('data','formula','na.action', 'subset', 'weights', 'ratevar'), nomatch=0) m[[1]] <- as.name('model.frame') m <- eval.parent(m) That is, the variables enroll.dat and age are searched for in the data= arg. This is like the start opion in glm, but a more complex result than a vector. The model.frame function can't handle this. (Splus fails too, same spot, less useful error message). ----- The current code uses fit <- survexp(~ gender + ratetable(sex=gender, year=enroll.dt, age=age*365.25), data=mydata, ratetable=survexp.us) The ratetable function creates a matrix with extra attributes. The matrix contains as.numeric of the factors with the levels remembered as an extra attribute, and also looks out for dates. So the result is like ns() in the eyes of model.frame, and it works. But having to write gender twice on the rhs is confusing to users. Thanks in advance for any comments. Terry Therneau ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel