Terry,

My point was this: if the question is "What is the average time to death,
given a set of variables?", the only logical way to calculate actual time
to death is to use the uncensored cases, because we do not know the time to
death for the censored cases and can only estimate it.  While the actual
time to death for uncensored cases may not be a very useful piece of
information, it can indeed be calculated.  As you point out, however,
predicted values for time to death can be estimated from the survival
function, which incorporates both censored and uncensored data.  That
said, the assumption of proportional hazards is rarely defensible.
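
To make the distinction concrete, here is a minimal sketch, assuming the
lung data from the survival package (status == 2 marks a death).  It
computes the plain mean of the observed death times and, next to it, the
Kaplan-Meier summary that uses censored and uncensored subjects together.
The object names are only illustrative.

```r
library(survival)

# Observed deaths only: in lung, status == 2 is a death, 1 is censored
deaths_only <- lung$time[lung$status == 2]
mean_uncensored <- mean(deaths_only)

# Kaplan-Meier estimate using every subject, censored or not
km <- survfit(Surv(time, status) ~ 1, data = lung)
km_table <- summary(km)$table   # same values that print(km) reports

mean_uncensored        # mean time to death among those seen to die
km_table[["median"]]   # median survival estimated from all subjects
km_table               # full summary; see help(print.survfit) for the mean
```

The first number describes only the subjects who were observed to die;
the Kaplan-Meier figures are estimates for the whole sample.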

Best,

Jim

On Fri, Nov 12, 2010 at 12:09 PM, Terry Therneau <thern...@mayo.edu> wrote:

> Since I read the list in digest form (and was out ill yesterday) I'm
> late to the discussion.
>
> There are 3 steps for predicting survival, using a Cox model:
>
> 1. Fit the data
>  fit <- coxph(Surv(time, status) ~ age + ph.ecog, data=lung)
>
> The biggest question to answer here is what covariates you wish to base
> the prediction on.  There is the usual tradeoff between too few (leaving
> out something important) and too many (including unimportant things).
>
> 2. Get survival curves
>  curves <- survfit(fit, newdata= _____)
> The newdata needs to include all the covariates in your model.
>
> 3. Summarize
>  Note that you don't get a single number prediction for each subject,
> you get a set of survival curves.  plot(curves[1]) for instance shows
> you the first one, plot(curves[2]) the second.
>  print(curves) will give a 1 line per curve summary including the
> median, and optionally one of several versions of the mean. See the
> discussion in help(print.survfit).  The mean is rarely used as a summary
> due to the fact that we don't see the whole distribution.  (Use
> temp <- summary(curves); temp$table to use the printout values in
> further calculations.)
>
> -------------------
>
>  The same process applies for parametric survival using survreg.  In
> return for specifying a distributional form, the predicted survival
> curve for a particular subject is completely defined.  This includes the
> mean and all quantiles.  Reliability analysis (survival analysis in
> industry) uses parametric models almost exclusively, since the tail of
> the distribution is of greatest interest.  Your use of
> predict(,type='response') is almost correct; there is just the math
> detail that the Weibull fits on a log scale, so the returned value is a
> geometric mean time to death rather than an arithmetic mean.
>
>  The suggestion to use ordinary regression on the observed times is
> wrong.  Censored data is more complex than that.
>
> Terry Therneau
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
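The steps in the quoted message can be sketched end to end as follows.
This is only an illustration: the two rows of new_subjects are made-up
covariate values, not anything from the thread, and the Weibull part
assumes survreg's log-scale parameterization, log(T) = lp + scale * log(E)
with E a standard exponential, under which the arithmetic mean is
exp(lp) * gamma(1 + scale).

```r
library(survival)

# Hypothetical subjects to predict for (values invented for illustration)
new_subjects <- data.frame(age = c(50, 70), ph.ecog = c(0, 2))

# --- Cox model: fit, get curves, summarize ---
fit <- coxph(Surv(time, status) ~ age + ph.ecog, data = lung)
curves <- survfit(fit, newdata = new_subjects)   # one curve per row
print(curves)                                    # one line per curve
temp <- summary(curves)
cox_median <- temp$table[, "median"]             # reusable in calculations

# --- Parametric (Weibull) model via survreg ---
wfit <- survreg(Surv(time, status) ~ age + ph.ecog, data = lung,
                dist = "weibull")
lp <- predict(wfit, newdata = new_subjects, type = "lp")
resp <- predict(wfit, newdata = new_subjects, type = "response")  # exp(lp)

# Arithmetic mean and median under the assumed Weibull parameterization
mean_time <- exp(lp) * gamma(1 + wfit$scale)
median_time <- predict(wfit, newdata = new_subjects,
                       type = "quantile", p = 0.5)

cbind(cox_median, weibull_response = resp, mean_time, median_time)
```

The Cox part yields whole curves, so the median is read off each curve,
while the Weibull fit gives the mean and any quantile in closed form once
the distribution is accepted.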



-- 
*James C. Whanger
Research Consultant
2 Wolf Ridge Gap
Ledyard, CT  06339

Phone: 860.389.0414*
