[R] Time Dependent Cox Model

2009-10-13 Thread quaildoc

I am having trouble formatting some survival data for use in a time-dependent
Cox model. My time-dependent variable is habitat, and I have it recorded for
every day (with some NAs).  I think the code is working properly except for
calculating death.time. This column should contain 1s and 0s, but as I have
it, it only produces 0s.  Any help will be greatly appreciated.


http://www.nabble.com/file/p25881478/Survival_master2.csv
Survival_master2.csv 



Here is my code:

sum(!is.na(surv[, 16:726]))  # number of non-missing daily habitat records

surv2 <- matrix(0, 12329, 19)
colnames(surv2) <- c('start', 'stop', 'death.time',
                     names(surv)[1:15], 'habitat')
row <- 0  # set record counter to 0
for (i in 1:nrow(surv)) {  # loop over individuals
  for (j in 16:726) {  # loop over the daily habitat columns (16 to 726)
    if (is.na(surv[i, j])) next  # skip missing data
    else {
      row <- row + 1  # increment row counter
      start <- j - 11  # start time (previous day)
      stop <- start + 1  # stop time (day)
      death.time <- if (stop == surv[i, 4] && surv[i, 5] == 1) 1 else 0
      # construct record:
      surv2[row, ] <- c(start, stop, death.time, unlist(surv[i, c(1:15, j)]))
    }
  }
}
surv2 <- as.data.frame(surv2)
-- 
View this message in context: 
http://www.nabble.com/Time-Dependent-Cox-Model-tp25881478p25881478.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Time Dependent Cox Model

2009-10-14 Thread quaildoc

Does anyone have suggestions? Thanks!




Re: [R] Time Dependent Cox Model

2009-10-14 Thread quaildoc

Someone suggested that I go into more detail on what I want to accomplish and
show the rest of my code.  I want to do exactly what Fox did in this
article ( http://www.nabble.com/file/p25897307/appendix-cox-regression.pdf
appendix-cox-regression.pdf ), starting on page 7, except using "habitat"
instead of employment. I want habitat to be a time-dependent covariate, and it
varies by day.

I read in my data from the csv file. One major difference between the data set
Fox used and mine is that I have a DaysatRisk column instead of the "week" the
person went back to jail. I think this is the root of my problem calculating
the proper death.time.  The death.time column should be 1s and 0s, with the 1
marking the day the animal died.

Thanks in advance,



library(survival)  # provides Surv() and coxph()

surv <- read.csv("Survival_master2.csv", header = TRUE)

sum(!is.na(surv[, 16:726]))  # number of non-missing daily habitat records

surv2 <- matrix(0, 12329, 19)
colnames(surv2) <- c('start', 'stop', 'death.time',
                     names(surv)[1:15], 'habitat')
row <- 0  # set record counter to 0
for (i in 1:nrow(surv)) {  # loop over individuals
  for (j in 16:726) {  # loop over the daily habitat columns (16 to 726)
    if (is.na(surv[i, j])) next  # skip missing data
    else {
      row <- row + 1  # increment row counter
      start <- j - 11  # start time (previous day)
      stop <- start + 1  # stop time (current day)
      death.time <- if (stop == surv[i, 4] && surv[i, 5] == 1) 1 else 0
      # construct record:
      surv2[row, ] <- c(start, stop, death.time, unlist(surv[i, c(1:15, j)]))
    }
  }
}
surv2 <- as.data.frame(surv2)
remove(i, j, row, start, stop, death.time)

surv2[1:15, ]

test <- coxph(Surv(start, stop, death.time) ~ habitat, data = surv2)
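
One thing that may be worth checking is the day offset. The `j - 11` looks carried over from Fox's example, where (as far as I can tell) the time-varying columns begin at column 11; here they begin at column 16, so the first daily column would get stop = 6 rather than 1, and the comparison against DaysatRisk would be shifted by 5 days. The sketch below is only an illustration on made-up data: the data frame `wide`, its column names, and `first_day_col` are all hypothetical, and it assumes DaysatRisk is the day number of death or censoring and that a separate 0/1 column flags death. It derives the offset from the position of the first daily column instead of hard-coding it.

```r
# Toy wide-format data (hypothetical names): 3 animals, 5 daily habitat columns.
wide <- data.frame(
  id         = 1:3,
  DaysatRisk = c(3, 5, 2),   # assumed: last day each animal was followed
  Dead       = c(1, 0, 1),   # assumed: 1 = died on that day, 0 = censored
  day1 = c(1, 2, 1),
  day2 = c(1, 2, 2),
  day3 = c(2, NA, NA),
  day4 = c(NA, 1, NA),
  day5 = c(NA, 1, NA)
)

first_day_col <- 4                      # column that holds day 1
day_cols <- first_day_col:ncol(wide)

rows <- vector("list", nrow(wide) * length(day_cols))
k <- 0
for (i in seq_len(nrow(wide))) {        # loop over individuals
  for (j in day_cols) {                 # loop over daily habitat columns
    if (is.na(wide[i, j])) next         # skip days with no habitat record
    stop_day  <- j - first_day_col + 1  # day number this column represents
    start_day <- stop_day - 1           # interval is (start_day, stop_day]
    died <- as.integer(stop_day == wide$DaysatRisk[i] && wide$Dead[i] == 1)
    k <- k + 1
    rows[[k]] <- data.frame(id = wide$id[i], start = start_day,
                            stop = stop_day, death.time = died,
                            habitat = wide[i, j])
  }
}
long <- do.call(rbind, rows[seq_len(k)])

table(long$death.time)  # should show one 1 per animal that died
```

If the table still shows only 0s on the real data, the next things I would look at are whether column 4 really is DaysatRisk, whether it is numeric rather than a factor, and whether a habitat value is actually recorded on the day of death (a missing value there means the loop skips the very row that should carry the 1).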


JorisMeys wrote:
> 
> Well,
> 
> it might be wise to elaborate a bit more about the variables and what
> exactly you want e.g. death-time to be. I'd interpret it as time of
> death, but the fact that it is 0/1 means it is a logical (?) binary
> variable of some sort.
> 
> Please ask your question in such a way that somebody who doesn't know
> the dataset and your research, can still understand what is inside the
> dataset and what exactly you're trying to obtain.
> 
> I'd also suggest to add the command to read in the data. I don't have
> the time to spend looking around how exactly I can read in the dataset
> in such a way it fits what you have in your workspace.
> 
> Cheers
> Joris
> 

[R] Prediction Error Calculation

2009-10-23 Thread quaildoc

Hello List,

I am fitting a logistic regression model to some presence/absence data.  I
have numerous covariates that I am fitting to explain variation, and I am
using AIC to rank models.  However, I would like to report how well my best
model(s) do at prediction.  I have looked over the archives and the web and
have come up with something that gives me what I think is the mean prediction
error, BUT I am not sure of that. I am fairly unfamiliar with these types of
statistics.  Here is my code:


library(boot)  # provides glm.diag()

# Type is my binary response (0 or 1)
metrics.global <- glm(Type ~ MPI + IJI + ED + PRD + class2 + class3 + class5,
                      family = binomial, data = metrics)

muhat <- metrics.global$fitted.values    # fitted values, named muhat
global.diag <- glm.diag(metrics.global)  # diagnostic values
cv.err <- mean((metrics.global$y - muhat)^2 / (1 - global.diag$h)^2)  # calculates cv.err
cv.err


My main problem is that I am unsure how to interpret what cv.err means for my
model.  I know that h is a leverage statistic for each observation.  I would
appreciate some help interpreting it.
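
For what it is worth, the formula above matches the leverage shortcut for leave-one-out cross-validation: each squared residual is inflated by 1/(1 - h)^2 to approximate the error you would get if that observation had been left out of the fit. With a 0/1 response, the result is a mean squared difference between the observed outcome and the predicted probability, so 0 would be perfect prediction and 0.25 is what a constant prediction of 0.5 would score. The shortcut is exact for ordinary linear models and only approximate for logistic regression. The sketch below checks it against an explicit leave-one-out loop on simulated data; the data, the variable names, and the use of `hatvalues()` in place of `glm.diag()$h` (which I believe returns the same leverages) are my assumptions, not part of the original code.

```r
# Simulated presence/absence data; all names here are hypothetical.
set.seed(1)
n <- 200
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.5 + d$x))

fit   <- glm(y ~ x, family = binomial, data = d)
muhat <- fitted(fit)
h     <- hatvalues(fit)  # leverages; stands in for glm.diag(fit)$h

# Apparent (in-sample) mean squared error of the fitted probabilities
apparent <- mean((d$y - muhat)^2)

# Leverage shortcut, same formula as cv.err in the post
cv_shortcut <- mean((d$y - muhat)^2 / (1 - h)^2)

# Explicit leave-one-out: refit without observation i, then predict it
loo_sq_err <- vapply(seq_len(n), function(i) {
  fit_i <- glm(y ~ x, family = binomial, data = d[-i, ])
  p_i   <- predict(fit_i, newdata = d[i, , drop = FALSE], type = "response")
  (d$y[i] - p_i)^2
}, numeric(1))
cv_loo <- mean(loo_sq_err)

c(apparent = apparent, shortcut = cv_shortcut, leave_one_out = cv_loo)
```

The shortcut can never be smaller than the apparent error, since every term is divided by a number no larger than 1. Comparing it with the explicit loop on your own model would show how much the approximation matters for your data.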

Thank you.




-- 
View this message in context: 
http://www.nabble.com/Prediction-Error-Calculation-tp26031236p26031236.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Prediction Error Calculation

2009-10-26 Thread quaildoc

Any suggestions?




Re: [R] Prediction Error Calculation

2009-10-29 Thread quaildoc

Any help would be appreciated.

