There are numerous issues, some of which David has already pointed out. I will
add a few and address others:

1. As far as I understand, you are looking at only one population. A survival 
model needs a 0/1 indicator of whether (and when) the species went extinct, 
not a probability; that is why Surv() rejects Decline as a status. With only 
one population, and hence at most one extinction event, such a model makes no 
sense.
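
For illustration only, here is a minimal sketch of the data layout a Cox model 
would expect: one row per population, with a follow-up time and a 0/1 
extinction indicator. The data frame `pops` and all of its values are made up 
(an assumption, not your data), and the sketch assumes the survival package.

```r
library(survival)

# Hypothetical toy data: one row per population, NOT one row per year.
pops <- data.frame(
  time   = c(12, 30, 45, 60, 80, 95, 110, 150),  # years until extinction or end of observation
  status = c(1, 1, 0, 1, 1, 0, 1, 0),            # 1 = went extinct, 0 = still present (censored)
  temp   = c(30.1, 22.4, 26.0, 27.5, 19.3, 24.9, 21.0, 23.2)  # made-up covariate
)

# Surv() takes the follow-up time and the 0/1 event indicator.
fit <- coxph(Surv(time, status) ~ temp, data = pops)
summary(fit)
```

With a single population there is only one such row, so there is nothing for 
the model to compare.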

2. Your dependent variable, however, is decline (or, more likely, a predicted 
percentage of the existing population relative to its baseline at date t=0; 
that would be my guess). Echoing David, what was this logistic regression 
(what was the model)? Was it derived from a count of the animals in each time 
period? Using its predictions as a dependent variable can create all sorts of 
issues (issues that can bias your results), and you may be better off working 
with the original data. Please give us more information on your dependent 
variable and on what this logistic regression was.

3. Your current dependent variable has a time-series nature, so you may be 
facing autocorrelation of the error term across observations. My best guess is 
that you would be better off modeling this as a time series, but again, we need 
more information.
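
As a rough sketch of what I mean, the following checks the residuals of an 
ordinary regression for autocorrelation and then fits a regression with AR(1) 
errors. The data are simulated (temp, y and the AR coefficient 0.7 are all 
assumptions for the sake of the example), and AR(1) is just one possible error 
structure, not a recommendation for your data.

```r
set.seed(1)
n <- 200

temp <- rnorm(n, mean = 25, sd = 5)                  # hypothetical predictor
e    <- as.numeric(arima.sim(list(ar = 0.7), n = n)) # simulated AR(1) errors
y    <- 1 + 0.1 * temp + e                           # simulated response

# Ordinary least squares ignores the serial dependence in the errors.
ols <- lm(y ~ temp)

# Lag-1 autocorrelation of the OLS residuals (element 1 of $acf is lag 0).
r1 <- acf(resid(ols), plot = FALSE)$acf[2]
r1

# Regression with AR(1) errors, using arima() from the stats package.
fit_ar <- arima(y, order = c(1, 0, 0), xreg = cbind(temp = temp))
fit_ar
```

A lag-1 residual autocorrelation far from zero suggests that the standard 
errors from lm() should not be trusted.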

4. As for the missing values, there are several ways to address this issue. 

1st. Imputation. This is probably not the right way to go when large amounts 
of data are missing; there is a host of literature on imputation.

2nd. Missing-value indicator coding. For each variable that contains NAs you 
create a second variable, a missing-value indicator (MVI). You code the 
indicator 1 if the underlying variable is NA and 0 if the underlying variable 
has a numeric value. All NAs in the underlying variable you recode to 0.

Example of missing-value indicator coding (Oxygen = variable with NAs, 
Recoded = recoded oxygen variable, MVI = missing-value indicator):

Oxygen Recoded MVI

3       3       0
5       5       0
NA      0       1
NA      0       1
6       6       0
NA      0       1
4       4       0

If the data are missing at random, the coefficient on the MVI should be 
insignificant. If it comes out significant, it tells you that something about 
the observations with missing data differs from the years for which you 
observed the independent variables. But that requires us to figure out which 
model to use in the first place.
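
In R, the recoding from the example could be sketched as follows. The oxygen 
values are the ones from the small table; the response y is a random 
placeholder and lm() is used only to show where the indicator enters a model, 
since the right model is still an open question.

```r
# Oxygen values from the example table.
oxygen <- c(3, 5, NA, NA, 6, NA, 4)

# Missing-value indicator: 1 where oxygen is NA, 0 otherwise.
mvi <- as.integer(is.na(oxygen))

# Recoded variable: NAs replaced by 0, observed values kept as they are.
recoded <- ifelse(is.na(oxygen), 0, oxygen)

data.frame(Oxygen = oxygen, Recoded = recoded, MVI = mvi)

# Placeholder response, purely illustrative (NOT real data).
set.seed(42)
y <- rnorm(length(oxygen))

# Both the recoded variable and its indicator go into the model;
# the row for "mvi" holds the coefficient discussed above.
fit <- lm(y ~ recoded + mvi)
coef(summary(fit))["mvi", ]
```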

Best,
Daniel

        
-------------------------
cuncta stricte discussurus
-------------------------
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of FishR
Sent: Wednesday, February 17, 2010 1:55 PM
To: r-help@r-project.org
Subject: [R] Survival analysis


Dear all
I have a dataset examining the probability of a population surviving
(calculated from a logistic regression) of a species over a 200yr period.
The predictor variables are either continuous but non-normal (e.g.
temperature, oxygen) or categorical (e.g. channelisation), unfortunately I
also have a large amount of missing values.  

Year    Decline Temperature     Oxygen  Channelisation
1800    0.947758115     36.6    NA      NA
1801    0.946135961     25.2    NA      NA
1802    0.944466388     28.5    NA      NA
1803    0.942748196     35.5    NA      NA
1804    0.940980166     33      NA      NA
1805    0.93916106      30.2    NA      NA
truncated …
1999    0.028531339     10.5    NA      5
2000    0.027649801     8.4     NA      5    

I have been trying to run a Cox Proportional Hazards Model with the code

model<-coxph(Surv(Year, Decline) ~ Temperature + Oxygen + Channelisation)

but keep getting an error message ‘Invalid status value’. 

Have I inputted the data in the wrong format or am I trying to run a totally
unsuitable model? 

Any help would be greatly appreciated 
Tom   

-- 
View this message in context: 
http://n4.nabble.com/Survival-analysis-tp1559155p1559155.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
