What is your sample size? I've had trouble getting reliable estimates using simple data splitting when N < 20,000.
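
One way to see how much a single random split can move the estimate is to repeat the split many times on the same data and look at the spread. The following is only a rough sketch under stated assumptions: the data are simulated, the 70/30 split and the Brier score are arbitrary choices, and brier_one_split is a hypothetical helper, none of which come from the thread.

```r
set.seed(1)

# Simulated binary-outcome data (an assumption, not the poster's data)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-0.5 + 0.8 * x1 - 0.5 * x2))
dat <- data.frame(y, x1, x2)

# Hypothetical helper: fit on a random training part, score on the rest
brier_one_split <- function(d, frac = 0.7) {
  idx <- sample(nrow(d), size = floor(frac * nrow(d)))
  fit <- glm(y ~ x1 + x2, family = binomial, data = d[idx, ])
  p   <- predict(fit, newdata = d[-idx, ], type = "response")
  mean((d$y[-idx] - p)^2)  # Brier score on the held-out rows
}

# Repeat the split 200 times; the spread shows how unstable one split is
briers <- replicate(200, brier_one_split(dat))
print(range(briers))
print(sd(briers))
```

A wide range relative to the mean suggests that any one split is giving a noisy answer at that sample size.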
Note that the following functions in the rms package facilitate cross-validation and bootstrapping for validating models: ols, validate, calibrate.

Frank

Andra Isan wrote:
>
> Hi,
>
> Thanks for the reply. What I meant is that I would like to partition my
> data dat (a data frame) into training and testing data and then evaluate
> the performance of the model on the test data. So, I thought cross-validation
> was the natural choice to see how the prediction works on the hold-out
> data. Is there any example that I can take a look at to see how to do
> cross-validation and get the prediction results on my data?
>
> Thanks a lot,
> Andra
>
> --- On Wed, 8/24/11, Prof Brian Ripley <rip...@stats.ox.ac.uk> wrote:
>
>> From: Prof Brian Ripley <rip...@stats.ox.ac.uk>
>> Subject: Re: [R] How to do cross validation with glm?
>> To: "Andra Isan" <andra_i...@yahoo.com>
>> Cc: r-help@r-project.org
>> Date: Wednesday, August 24, 2011, 10:11 AM
>>
>> What you describe is not cross-validation, so I am afraid we do not
>> know what you mean. And cv.glm does 'prediction for the hold-out
>> data' for you: you can read the code to see how it does so.
>>
>> I suspect you mean you want to do validation on a test set, but that
>> is not what you actually claim. There are lots of examples of this
>> sort of thing in MASS (the book, scripts in the package).
>>
>> On Wed, 24 Aug 2011, Andra Isan wrote:
>>
>> > Hi All,
>> >
>> > I have a model called glm.fit, which I fitted with glm, and dat is
>> > my data frame. I used
>> >
>> > pred = predict(glm.fit, newdata = dat, type = "response")
>> >
>> > to see how it predicts on my whole data, but obviously I have to do
>> > cross-validation to train the model on one part of my data and
>> > predict on the other part. So, I searched for it and I found a
>> > function cv.glm, which is in package boot. So, I tried to use it as:
>> >
>> > cv.glm = (cv.glm(dat, glm.fit, cost, K=nrow(dat))$delta)
>> >
>> > but I am not sure how to do the prediction for the hold-out data.
>> > Is there any better way for cross-validation to learn a model on
>> > training data and test it on test data in R?
>> >
>> > Thanks,
>> > Andra
>> >
>> > ______________________________________________
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>> --
>> Brian D. Ripley,                  rip...@stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>>

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/How-to-do-cross-validation-with-glm-tp3765994p3766108.html
Sent from the R help mailing list archive at Nabble.com.
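
For readers following the thread, the two things being discussed (validation on a held-out test set, and K-fold cross-validation with cv.glm from package boot) can be sketched roughly as below. This is a minimal illustration, not the posters' code: the data are simulated, the formula y ~ x1 + x2, the 200/100 split, K = 10, and the misclassification cost function are all assumptions made for the example.

```r
library(boot)  # recommended package shipped with R; provides cv.glm

set.seed(42)

# Simulated stand-in for the poster's data frame 'dat' (an assumption)
n   <- 300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.3 + dat$x1 - 0.7 * dat$x2))

# Misclassification rate at a 0.5 threshold (illustrative choice of cost)
cost <- function(y, p) mean(abs(y - p) > 0.5)

# 1. Validation on a test set: fit on the training rows only,
#    then predict the held-out rows via 'newdata'
train_idx  <- sample(n, size = 200)
train      <- dat[train_idx, ]
test       <- dat[-train_idx, ]
fit_train  <- glm(y ~ x1 + x2, family = binomial, data = train)
pred_test  <- predict(fit_train, newdata = test, type = "response")
test_error <- cost(test$y, pred_test)

# 2. 10-fold cross-validation: cv.glm refits the model on each training
#    fold and predicts the corresponding hold-out fold internally
fit_all <- glm(y ~ x1 + x2, family = binomial, data = dat)
cv_err  <- cv.glm(dat, fit_all, cost = cost, K = 10)$delta

print(test_error)
print(cv_err)  # raw and bias-adjusted CV estimates of the cost
```

The test-set estimate depends on which 100 rows happen to be held out, whereas the cross-validated estimate averages over all folds, which is one reason the replies steer toward resampling-based validation.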