The issue is ties in survival times. Replicating the rows produces tied
failure times (each event becomes three tied events); weighting does not, so
coxph's default Efron approximation for ties gives different results for the
two data sets. The effect is unusually large here, perhaps because of the small
sample size. If you use the (less accurate) Breslow approximation for ties, you
do get the same answer for both data sets:
coxph(Surv(time,status)~x+strata(sex),data=test,weights=wt,method="breslow")
Call:
coxph(formula = Surv(time, status) ~ x + strata(sex), data = test,
weights = wt, method = "breslow")
  coef exp(coef) se(coef)    z    p
x 1.01      2.73    0.734 1.37 0.17
Likelihood ratio test=1.99 on 1 df, p=0.159 n=6 (1 observation deleted due to missingness)
coxph(Surv(time,status)~x+strata(sex),data=test_freq,method="breslow")
Call:
coxph(formula = Surv(time, status) ~ x + strata(sex), data = test_freq,
method = "breslow")
  coef exp(coef) se(coef)    z    p
x 1.01      2.73    0.734 1.37 0.17
Likelihood ratio test=1.99 on 1 df, p=0.159 n=18 (3 observations deleted due to missingness)
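For anyone who wants to reproduce the comparison, here is a minimal sketch. It assumes the survival package is installed and rebuilds the data from the values quoted below; the names `form` and `coefs` are mine, not from the original post. It fits the weighted and the replicated data under both tie-handling methods and collects the coefficient for x.

```r
library(survival)

# Data as quoted in the original post, with a constant weight of 3.
test1 <- data.frame(
  time   = c(4, 3, 1, 1, 2, 2, 3),
  status = c(1, NA, 1, 0, 1, 1, 0),
  x      = c(0, 2, 1, 1, 1, 0, 0),
  sex    = c(0, 0, 0, 0, 1, 1, 1),
  wt     = rep(3, 7)
)

# Replicate every row three times (the unweighted equivalent of wt = 3).
test_freq <- test1[rep(seq_len(nrow(test1)), each = 3),
                   c("time", "status", "x", "sex")]

form <- Surv(time, status) ~ x + strata(sex)

# Coefficient of x under each combination of data set and tie method.
coefs <- c(
  efron_weighted     = coef(coxph(form, data = test1, weights = wt,
                                  method = "efron"))[["x"]],
  efron_replicated   = coef(coxph(form, data = test_freq,
                                  method = "efron"))[["x"]],
  breslow_weighted   = coef(coxph(form, data = test1, weights = wt,
                                  method = "breslow"))[["x"]],
  breslow_replicated = coef(coxph(form, data = test_freq,
                                  method = "breslow"))[["x"]]
)
print(round(coefs, 3))
```

The two Breslow coefficients should agree, while the two Efron coefficients should differ (the thread reports roughly 1.17 versus 1.41).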
-thomas
On Fri, 13 Jun 2008, mah wrote:
I am confused by the results of the weights option for coxph. I
replicated each row three times from the help page for coxph in the
data frame test_freq. I had expected that the coefficients,
significance tests, and tests of non-proportionality would yield the
same results for the replicated and non-replicated data, but the
output below shows differences in all three metrics. Is this the
result of a curved response variable? This is likely more of a
conceptual question than a language question, but all help is
sincerely appreciated.
Mike
test1
$time
[1] 4 3 1 1 2 2 3
$status
[1] 1 NA 1 0 1 1 0
$x
[1] 0 2 1 1 1 0 0
$sex
[1] 0 0 0 0 1 1 1
$wt
[1] 3 3 3 3 3 3 3
test_freq
time status x sex
1 4 1 0 0
2 4 1 0 0
3 4 1 0 0
4 3 NA 2 0
5 3 NA 2 0
6 3 NA 2 0
7 1 1 1 0
8 1 1 1 0
9 1 1 1 0
10 1 0 1 0
11 1 0 1 0
12 1 0 1 0
13 2 1 1 1
14 2 1 1 1
15 2 1 1 1
16 2 1 0 1
17 2 1 0 1
18 2 1 0 1
19 3 0 0 1
20 3 0 0 1
21 3 0 0 1
t1 <- coxph( Surv(time, status) ~ x + strata(sex), data=test1, weights=wt)
summary(t1)
Call:
coxph(formula = Surv(time, status) ~ x + strata(sex), data = test1,
weights = wt)
n=6 (1 observation deleted due to missingness)
  coef exp(coef) se(coef)    z    p
x 1.17      3.22    0.744 1.57 0.12
  exp(coef) exp(-coef) lower .95 upper .95
x      3.22      0.311     0.749      13.8
Rsquare= 0.353 (max possible= 0.999 )
Likelihood ratio test= 2.61 on 1 df, p=0.106
Wald test = 2.47 on 1 df, p=0.116
Score (logrank) test = 2.67 on 1 df, p=0.102
cox.zph(t1)
      rho   chisq     p
x -0.0716 0.00598 0.938
t_freq <- coxph( Surv(time, status) ~ x + strata(sex), data=test_freq)
summary(t_freq)
Call:
coxph(formula = Surv(time, status) ~ x + strata(sex), data = test_freq)
n=18 (3 observations deleted due to missingness)
  coef exp(coef) se(coef)    z     p
x 1.41      4.09    0.756 1.86 0.063
  exp(coef) exp(-coef) lower .95 upper .95
x      4.09      0.245     0.929      18.0
Rsquare= 0.185 (max possible= 0.879 )
Likelihood ratio test= 3.69 on 1 df, p=0.0549
Wald test = 3.47 on 1 df, p=0.0626
Score (logrank) test = 3.84 on 1 df, p=0.0499
cox.zph(t_freq)
      rho  chisq     p
x -0.0697 0.0526 0.819
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Thomas Lumley Assoc. Professor, Biostatistics
[EMAIL PROTECTED] University of Washington, Seattle