Thank you. Both train and test originate from the same data object. Here is the missing code:

data<-read.csv("old4.csv", header=TRUE)
library(imputeMissings)
data<-impute(data,object = NULL ,method = "median/mode")
for (i in col[13:68]) {
  data[i]<-lapply(data[i], factor)
}
for (i in col[1:12]) {
  data[i]<-lapply(data[i], as.numeric)
}
data$TIME<-as.numeric(data$TIME)
data<-data[-c(61,62,64,65,66,67,68)]
data$TIME<-ceiling(data$TIME/12)
data$TIME[which(data$TIME==37)]<-36
data1 = sort(sample(nrow(data), nrow(data)*.7))
train<-data[data1,]
test<-data[-data1,]

So test should have exactly the same columns as train, and I still can't find the issue.

Thank you,
Amir

On Sat, Nov 16, 2019 at 12:00 AM David Winsemius <dwinsem...@comcast.net> wrote:

>
> On 11/15/19 10:49 AM, Amir Hadanny wrote:
> > Hi all,
> > I'm trying to get the prediction probabilities for a survival elastic net.
> > When I try to predict using the trained model on the test set, it creates
> > an object with the number of rows of the train data (6400 rows) instead of
> > the test data (2400 rows). I really don't understand why, and that doesn't
> > let me check the performance (c-index).
>
> If you call most `predict` functions with a second argument that fails
> to contain the predictors in the model, they return the predictions on
> the original data. The only place where the `test` object appears prior
> to the predict operation is in your call to `predict.coxph`, so my guess
> is that it fails to meet the requirements of the function for a valid
> newdata argument. (Another thought was that maybe `test` didn't exist,
> but that should have thrown an error with the predict call and the nrow
> call.)
>
> But since you don't provide code that creates `test` or even an
> unambiguous way of examining its structure, that is entirely a guess.
>
> And finally ... Rhelp is a plain text mailing list, so please read
> the message at the bottom of every transmission from the mailserver ...
> i.e. read the Posting Guide. (It is not at all difficult to get
> gmail.com to send plain text.)
>
> --
> David.
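
To illustrate the point above about `predict` falling back to the original data, here is a minimal sketch. It uses `lm` from base R rather than `coxph`, and the data and the names `fit_matrix`, `fit_cols`, `p_matrix` and `p_cols` are made up for illustration. When the formula refers to a matrix `x` that lives in the workspace, `predict` cannot find `x` inside `newdata`, so it picks up the training matrix from the workspace instead.

```r
set.seed(1)

# Made-up training and test data (illustrative only)
train <- data.frame(a = rnorm(20), b = rnorm(20))
train$y <- train$a + rnorm(20)
test <- data.frame(a = rnorm(5), b = rnorm(5))

# Predictor matrix built from the training data, as in the posted code
x <- data.matrix(train[c("a", "b")])

# Formula refers to the workspace matrix `x`: `test` has no column named
# `x`, so predict() evaluates `x` from the workspace (and warns about it)
fit_matrix <- lm(y ~ x, data = train)
p_matrix <- predict(fit_matrix, newdata = test)
length(p_matrix)  # 20, the number of training rows

# Formula refers to column names that also exist in `test`
fit_cols <- lm(y ~ a + b, data = train)
p_cols <- predict(fit_cols, newdata = test)
length(p_cols)    # 5, the number of test rows
```

The posted model was fit with `~x`, where `x` is the training matrix, which would be consistent with this behaviour. One way to avoid it, assuming the selected predictors are available as columns of both train and test, is to name those columns in the formula instead of passing a matrix.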
>
> > the code:
> >
> > data<-read.csv("old4.csv", header=TRUE)
> > library(imputeMissings)
> > data<-impute(data,object = NULL ,method = "median/mode")
> >
> > trainstatus<-train$DIED1095
> > trainTime<-train$TIME
> > y<-Surv(trainTime,trainstatus)
> >
> > trainX<-train[-c(12,63,64,65,66,67,68,69,70,71)]
> > x<-data.matrix(trainX)
> >
> > library(glmnet)
> > fit <- glmnet(x,Surv(trainTime,trainstatus),family="cox",alpha=0.1,
> >               maxit=10000)
> > max.dev.index <- which.max(fit$dev.ratio)
> > optimal.lambda <- fit$lambda[max.dev.index]
> > optimal.beta <- fit$beta[,max.dev.index]
> > nonzero.coef <- abs(optimal.beta)>0
> > selectedBeta <- optimal.beta[nonzero.coef]
> > selectedTrainX <- x[,nonzero.coef]
> >
> > coxph.model<- coxph(Surv(train$TIME,train$DIED365) ~x,data=train,
> >                     init=selectedBeta,iter=0)
> > coxph.predict<-predict(coxph.model,test)
> >
> > nrow(test)
> > 2872
> >
> > nrow(train)
> > 6701
> >
> > length(coxph.predict)
> > 6701
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.