On Sep 15, 2009, at 9:58 PM, Bryan Hanson wrote:
R Folk:
Please forgive what I'm sure is a fairly naïve question; I hope it's
clear.
A colleague and I have been doing a really simple one-off survival
analysis,
but this is an area with which we are not very familiar, we just
happen to
have gathered some data that needs this type of analysis. We've
done quite
a bit of reading, but answers escape us, even though the question
below
seems simple.
Considering the following example from ?survdiff:
survdiff(Surv(time, status) ~ pat.karno, data=lung)
Call:
survdiff(formula = Surv(time, status) ~ pat.karno, data = lung)
n=225, 3 observations deleted due to missingness.
N Observed Expected (O-E)^2/E (O-E)^2/V
pat.karno=30 2 1 0.658 0.1774 0.179
pat.karno=40 2 1 1.337 0.0847 0.086
pat.karno=50 4 4 1.079 7.9088 8.013
pat.karno=60 30 27 15.237 9.0808 10.148
pat.karno=70 41 31 26.264 0.8540 1.027
pat.karno=80 51 39 40.881 0.0865 0.117
pat.karno=90 60 38 49.411 2.6354 3.853
pat.karno=100 35 21 27.133 1.3863 1.684
Chisq= 22.6 on 7 degrees of freedom, p= 0.00202
The p value here is for the entire group (right?). How do we go about
determining the p value for the comparison of any four arbitrary
groups in
all combinations, say pat.karno = 40, 60, 80, and 100?
We know (we think) that we can't just run the coxph analysis for the
only
the groups of interest, as the hazard ratio for any one group in an
analysis
with several groups is computed by holding the other groups at their
average
value, so the hazard ratio varies by the context.
Seems like we need some sort of t-test or chi-squared test, but
being mere
chemists and molecular biologists, we don't quite see it and
wouldn't trust
ourselves anyway, given the special nature of survival analysis.
Manual
instructions or a function suggestion would be great.
The column labeled (O-E)^2/V should be distributed (at least if these
were large
samples) as a chi-square statistic with 1 d.f. each. You've got 3 in
there that
exceed the nominal 95th%ile of a X^2, but you really ought to be
looking at the
actual to expected looked along the ordinal scale. "pat.karno" is
pretty clearly,
(at least to this physician) the Karnofsky Performance score, and
since that is
ordinal, you probably should be doing a trend test. Perhaps you could
set up
pat.karno as an ordered variable and see if survdiff gives you a
meaningful output.
If you are dealing with an analysis that might affect people's lives,
wouldn't it
be ethically safer to involve a real statistician?
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.