Hi all,

I have found the survfit function to be very slow on a large dataset, and I am looking for a faster way to get predicted survival probabilities.

My historical dataset is a pool of loans with default status observed monthly for 24 months. I would like to fit a proportional hazards model in counting process format, with time-varying covariates such as the unemployment rate and time-constant covariates recorded at loan application. I would then use the model to predict the probability of default in each month of the next 2 years for a pool of new loans.

I have read some posts from other R users. It sounds like

(average survival probability)^exp((X - mean(X)) * Beta)

can quickly give the predicted survival probabilities. My predictors include both continuous and categorical variables, and my dataset is in counting process format with both time-varying and time-constant predictors.

So how should I take the mean? I guess it's the mean over the training data? And is the denominator for the mean the number of observations (i.e., the number of rows of training data in counting process format)? What if the predictor is a categorical variable?

Any comments and suggestions are greatly appreciated.

Thanks!
Ying
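P.S. To make the formula concrete, here is a minimal sketch of what I have in mind. It is only an illustration under assumptions: it uses the built-in lung data from the survival package as a stand-in for my loan data, it has time-constant continuous covariates only, and it takes the reference values stored in fit$means as the "mean" in the formula. The names newdata, lp, S0 and S are just for this example. With time-varying covariates I assume the exponent could not be applied to the whole curve at once and would instead have to be applied to the hazard increment in each interval.

```r
library(survival)

# Toy fit on the built-in lung data (a stand-in for the loan data;
# time-constant, continuous covariates only).
fit <- coxph(Surv(time, status) ~ age + ph.karno, data = lung)

# Cumulative hazard at the reference covariate values in fit$means,
# converted to the corresponding survival curve.
bh <- basehaz(fit, centered = TRUE)
S0 <- exp(-bh$hazard)

# Hypothetical new observations to predict for.
newdata <- data.frame(age = c(50, 70), ph.karno = c(90, 60))

# Centered linear predictor: (x - reference values) %*% beta.
X  <- as.matrix(newdata[, names(coef(fit))])
lp <- drop(sweep(X, 2, fit$means) %*% coef(fit))

# S(t | x) = S0(t)^exp(lp): one row per time in bh$time,
# one column per row of newdata.
S <- outer(S0, exp(lp), `^`)
```

Each column of S is then a predicted survival curve on the time grid bh$time, and the predicted probability of an event by time t would be 1 - S.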
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.