The following variation of what you proposed lets you either subset the dataset outside coxph or use the subset argument:

subwrap5 <- function(x, sb=NULL) {
  coxph(Surv(times,event)~trt, data = x, subset = sb)
}

subwrap5(testdf, testdf$sex == 'F')
subwrap5(testdf[testdf$sex == 'F', ])

I'm not sure whether either of these would be more efficient in memory usage; if this is an issue, you could test with large datasets.

-Christos

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Erik Iverson
> Sent: Friday, November 09, 2007 12:05 PM
> To: '[EMAIL PROTECTED]'
> Subject: [R] wrapper for coxph with a subset argument
>
> Dear R-help -
>
> Thanks to those who replied yesterday (Christos H. and Thomas
> L.) regarding my question on coxph and model formula, the
> answers worked perfectly.
>
> My new question involves the following.
>
> I want to run several coxph models (package survival) with
> the same dataset, but different subsets of that dataset.
>
> I have found a way to do this, described below in functions
> subwrap1 and subwrap2. These do not use the coxph "subset"
> argument, however, as you will see.
>
> My three main questions are :
>
> 1) When writing a wrapper like this, should I be using the
> subset argument in coxph(), or alternatively, doing what I am
> doing in subwrap1 and subwrap2 below? Is the subset argument
> in coxph more of a convenience tool for interactive use
> rather than programs?
>
> 2) If the approach in subwrap1 and subwrap2 is fine, is there
> a preference for using 'expressions' or 'strings'?
> Eventually, my program will create these subset conditions
> programmatically, so I think strings will be the way I have
> to go, even though I've seen warnings on this list about
> using the eval(parse()) construct.
>
> 3) Is there some approach to do this that I'm overlooking?
> My goal will be to produce a list of subset conditions
> (probably a character vector), and then use lapply to run the
> various cox regressions.
>
> I can already achieve my goal, I just would like to know more
> details about how others do things like this.
>
> I've simplified my code below to focus on where I feel I'm confused.
> Here is some code along with comments:
>
> #### BEGIN R SAMPLE CODE
>
> #Function for producing test data
> makeTestDF <- function(n) {
>   times <- sample(1:200, n, replace = TRUE)
>   event <- rbinom(n, 1, prob = .1)
>   trt <- rep(c("A","B"), each = n/2)
>   sex <- factor(c("M","F"))
>   sex <- rep(sex, times = n/2)
>   testdf <- data.frame(times,event,trt,sex)
> }
>
> # Make test data, n = 200
> testdf <- makeTestDF(200)
>
> # Cox wrapper function with subset, this one works
> # Takes subset as expression
> subwrap1 <- function(x, sb) {
>   sb <- eval(substitute(sb), x)
>   x <- x[sb,]
>   coxph(Surv(times,event)~trt, data = x)
> }
>
> subwrap1(testdf, sex == 'F')
>
> # This next one also works, but uses a character variable
> # instead of an expression as the subset argument
>
> subwrap2 <- function(x, sb) {
>   sb <- eval(parse(text = sb), x)
>   x <- x[sb,]
>   coxph(Surv(times,event)~trt, data = x)
> }
>
> subwrap2(testdf, "sex == 'F'")
>
> # Neither of the above use the coxph subset argument
> # If I try using that, I get stuck with expressions,
> # I've tried many
> # different things in the subset argument, but none
> # seem to do the trick. Is using this argument in a
> # program even advisable?
>
> subwrap3 <- function(x, sb) {
>   coxph(Surv(times,event)~trt, data = x,
>         subset = eval(substitute(sb), x))
> }
>
> subwrap3(testdf, sex == 'F') #does not work
>
> # Using a string, this works, however.
>
> subwrap4 <- function(x, sb) {
>   coxph(Surv(times,event)~trt, data = x,
>         subset = eval(parse(text=sb)))
> }
>
> subwrap4(testdf, "sex == 'F'")
>
> ### END R SAMPLE CODE
>
> Thanks so much,
> Erik Iverson
> [EMAIL PROTECTED]
>
> > sessionInfo()
> R version 2.5.1 (2007-06-27)
> i686-pc-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] "grDevices" "datasets"  "tcltk"     "splines"   "graphics"  "utils"
> [7] "stats"     "methods"   "base"
>
> other attached packages:
>        debug     mvbutils SPLOTS_1.2-6        Hmisc        chron     survival
>      "1.1.0"      "1.1.1"      "1.2-6"      "3.4-2"     "2.3-13"       "2.32"
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
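
Regarding question 3 in the quoted message (a character vector of subset conditions run through lapply), here is a minimal sketch of how that might look. It reuses the eval(parse(text = ...)) idea from subwrap2 and subsets the data before calling coxph. The names conds and fit_one are hypothetical, the seed is only there for reproducibility, and the event probability is raised to 0.3 (an assumption, not the value in the original test data) so that each subset has a reasonable number of events.

```r
library(survival)

# Test data in the same shape as makeTestDF() in the quoted message
set.seed(1)
n <- 200
testdf <- data.frame(
  times = sample(1:200, n, replace = TRUE),
  event = rbinom(n, 1, prob = 0.3),   # assumption: higher than the original .1
  trt   = rep(c("A", "B"), each = n / 2),
  sex   = factor(rep(c("M", "F"), times = n / 2))
)

# Hypothetical named character vector of subset conditions
conds <- c(female = "sex == 'F'", male = "sex == 'M'")

# Hypothetical helper: evaluate one condition string inside the data frame,
# subset the rows, then fit the Cox model on the reduced data
fit_one <- function(cond, x) {
  keep <- eval(parse(text = cond), x)
  coxph(Surv(times, event) ~ trt, data = x[keep, , drop = FALSE])
}

# One fitted model per condition; the list keeps the names of conds
fits <- lapply(conds, fit_one, x = testdf)
```

Each element of fits is an ordinary coxph object, so fits$female or lapply(fits, summary) can be used as usual. Subsetting outside coxph sidesteps the question of how the subset argument is evaluated inside a wrapper, at the cost of copying the selected rows.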