--- begin included message --- I'm trying to run a conditional logistic regression in R (2.14.0) using clogit from the survival package. The dataset I have is relatively small (300 observations) with 25 matched strata- there are roughly 2 controls for each case, and some strata have multiple case/control groups. When I try to fit a very simple model with a binary outcome and a single continuous exposure R seems to freeze for a while, and 30 minutes later I receive the results. However, when I run the exact same conditional logistic regression model in STATA 10, the exact same answers are produced in <1 second. (Same coefficients, LR test resutls, standard errors, etc.). I just tried running an expanded model with covariates that I'd like to control for, but R has been unresponsive and I doubt it will resolve itself anytime soon. The syntax I'm using is:
library(survival) clogit(binary_out~contin_exp + strata(id), data=data) -------------- end inclusion -------------- Quick answer: use method='approximate' Long answer: A stratified Cox model with time=constant, status:1=case,0=control, and using the "exact partial likelihood" has exactly the same likelihood formula as a conditional logistic regression. I've never settled my mind on whether this equivalence is a deep statistical connection or just lucky algebra. Nevertheless, the clogit routine does it's work by creating a dummy variable of times (all 1, but any constant would do) and then calling coxph. In coxph, the computation of the exact partial likelihood can be very, very, very slow. If a particular strata had say 10 events and 20 subjects it has to add up a denominator that involves all possible choices of 10 out of 20, which is 20!/(10! 10!) = 184756 terms. Most of the time conditional logisitic is run on case/control pairs or triplets (case + 2 controls) which involves choosing 1 out of n and this issue does not arise. There is a very clever but complex algorithm that involves nested subsets, I forget the authors, which some packages have implemented and which speeds this up considerably. From your comments above I'm guessing Stata is on that list. The survival package in R is not. a. The case comes up rarely, and when it does a coxph call using the 'efron' approximation is usually very close. A model with the 'breslow' approx is usually ok. (The clogit routine invokes exact or Breslow, I think for historical reasons, but it is one of few parts of survival that I didn't write so I can't be definitive.) b. Yup, it should be added someday, and it's on my list. But there are 10-15 other things ahead of it, and it's been that way for 15 years. Volunteers welcomed. Terry Therneau ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.