Dear friends, I have used R for some time now and have a tricky question about the coxph function. To sum it up: I am not sure whether I can use coxph with missing covariate data in a model with time-variant covariates. The point is that I know how "old" every part I observe is, but I do not have the full history of the corresponding covariates. Maybe you have some advice for me, although this problem might be only 70% R-related and 30% statistics-related. Here's a detailed explanation:
SITUATION & OBJECTIVE: I want to analyze the effect of environmental conditions (i.e. temperature and humidity) on the lifetime of some wear-parts. The study is conducted on a yearly basis, meaning that I have collected empirical data on every wear-part at the end of every year.

DATA: I have collected the following data:
- Status of the wear-part: equals "0" if the part is still alive, equals "1" if the part has "died" (my event variable).
- Environmental data: temperature and humidity have been measured at each of the wear-parts on a yearly basis (because each wear-part is at a different location, I have different data for each wear-part).

PROBLEM: I collected data from 2001 to 2007. In 2001, a large number of wear-parts had already been in use. I DO KNOW for every part how long it has been in use (even if it was put into service before 2001), but I DO NOT have any information about environmental conditions like temperature or humidity before 2001 (I call this "semi-left-censored"). Of course, one could argue that I should simply exclude these parts from my analysis, but I don't want to lose valuable information, also because the number of "new parts" that were put into service between 2001 and 2007 is rather small. Additionally, I cannot make any assumption about the underlying lifetime distribution. Therefore I have to use a non-parametric or semi-parametric model for estimation (most likely Cox).

QUESTION: From an econometric perspective, is it possible to use the Cox proportional hazards model in this setting? As mentioned before, I have time-variant covariates for each wear-part, as well as what I call "semi-left-censored" data that I want to use. If not, what kind of analysis would you suggest?

Thanks a lot for your great help, I really appreciate it.
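
To make the data layout concrete, here is a minimal sketch of how I imagine the setup in counting-process form, with one row per part and observed year, and the age of the part (not calendar time) as the time scale. Everything in it is an assumption for illustration only: the data are simulated, and the names long, start, stop, status, temp and humidity are hypothetical. Parts that were already in use in 2001 simply enter the risk set at their age in 2001 (delayed entry), so the years without covariate data never appear as rows. Whether this is statistically appropriate for my case is exactly what I am unsure about.

```r
library(survival)

set.seed(1)
n_parts <- 200   # hypothetical number of wear-parts
n_years <- 7     # observation years 2001 to 2007

# Simulated data, for illustration only: one row per part and observed year.
long <- do.call(rbind, lapply(seq_len(n_parts), function(i) {
  age_2001 <- sample(0:10, 1)                  # age (years) when observation starts
  temp     <- rnorm(n_years, mean = 20, sd = 5)
  humidity <- runif(n_years, min = 30, max = 90)
  p_death  <- plogis(-3 + 0.08 * (temp - 20))  # assumed yearly failure probability
  died     <- rbinom(n_years, size = 1, prob = p_death)
  # keep rows up to and including the first failure, or all years if none
  last <- if (any(died == 1)) which(died == 1)[1] else n_years
  k    <- seq_len(last)
  data.frame(id       = i,
             start    = age_2001 + k - 1,      # age at the start of the interval
             stop     = age_2001 + k,          # age at the end of the interval
             status   = died[k],               # 1 only in the row where the part fails
             temp     = temp[k],
             humidity = humidity[k])
}))

# Counting-process Cox model: (start, stop] intervals carry the yearly covariates,
# and a first 'start' greater than 0 encodes delayed entry (left truncation).
fit <- coxph(Surv(start, stop, status) ~ temp + humidity, data = long)
summary(fit)
```

With this layout, a part that was 6 years old in 2001 contributes intervals (6,7], (7,8], and so on, and is only compared with other parts that are under observation at the same age.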
All the best
Philipp

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.