Thanks a lot Joshua. That clearly solved my problem. I actually tried number 3 and it works perfectly fine. I used the prediction function as follows:

pred = predict(glm.fit, newdata = dat, type = "response")

(glm.fit is my fitted model) to see how it predicts on my whole data, but obviously I have to do cross-validation: train the model on one part of my data and predict on the other part. So I searched for it and found the function cv.glm in package boot. I tried to use it as:

cv.glm = (cv.glm(dat, glm.fit, cost, K=nrow(dat))$delta)

but I am not sure how to do the prediction for the hold-out data. Is there a better way in R to cross-validate, that is, to learn a model on training data and test it on test data?

Thanks,
Andra

--- On Mon, 8/22/11, Joshua Wiley <jwiley.ps...@gmail.com> wrote:

> From: Joshua Wiley <jwiley.ps...@gmail.com>
> Subject: Re: [R] GLM question
> To: "Andra Isan" <andra_i...@yahoo.com>
> Cc: r-help@r-project.org
> Date: Monday, August 22, 2011, 9:54 PM
>
> Hi Andra,
>
> There are several problems with what you are doing (by the way, I
> point them out so you can learn and improve, not to be harsh or rude).
> The good news is there is a solution (#3) that is easier than what
> you are doing right now!
>
> 1) glm.fit() is a function so it is a good idea not to use it as a variable
>
> 2) You are looping through your variables, when you could avoid the
> loop and use:
> paste(x, collapse = " + ")
>
> for example with the first ten letters of the alphabet:
>
> > paste(LETTERS[1:10], collapse = " + ")
> [1] "A + B + C + D + E + F + G + H + I + J"
>
> 3) If you store your data in a data frame like:
>
> dat <- as.data.frame(cbind(Y = y, x))
>
> you do not need to do anything other than:
>
> glm(Y ~ ., data = dat, family = binomial)
>
> because R will expand the "." to be every variable in the dataset that
> is not the outcome. This would be my recommendation.
>
> 4) If you really wanted to use your pasted string, try it like this:
>
> f <- "mpg ~ hp" # create formula as string
> lm(as.formula(f), data = mtcars) # convert to formula and use in model
>
> although there are many variants of this, some of which may be better.
> Still, I would recommend #3 in your case over #4.
>
> I hope this helps,
>
> Josh
>
> On Mon, Aug 22, 2011 at 9:43 PM, Andra Isan <andra_i...@yahoo.com> wrote:
> > Hi All,
> >
> > I am trying to fit my data with a glm model. My data is a matrix of
> > size n*100, so I have n rows and 100 columns, and my vector y is of
> > size n and contains the labels (0 or 1).
> >
> > My question is:
> > instead of manually typing the model as
> > glm.fit = glm(y~ x[,1]+x[,2]+...+x[,100], family=binomial())
> >
> > I have a for loop that concatenates the x variables as follows:
> >
> > final_str=NULL
> > for (m in 1:100){
> >   str = paste(x[,m],+,sep="")
> >   final_str= paste(final_str,str,sep="")
> > }
> >
> > glm.fit = glm(y~final_str,family=binomial())
> > but final_str is treated as a string and it does not work. Could you
> > please help me with fixing that?
> >
> > Thanks a lot,
> > Andra
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> Programmer Analyst II, ATS Statistical Consulting Group
> University of California, Los Angeles
> https://joshuawiley.com/
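
On the cross-validation question in this thread: cv.glm() from package boot returns only the estimated prediction error in $delta (and K = nrow(dat) makes it leave-one-out), so it does not hand back predictions for the held-out rows. One way to get those is a manual K-fold loop around glm() and predict(). The following is only a minimal sketch under stated assumptions: the data are simulated as a stand-in for the real n*100 matrix, and the names K, fold, cv_pred and cv_error as well as the 0.5 cutoff are illustrative choices, not anything from the original posts.

```r
# Simulated stand-in for the poster's data (assumption): n rows, 0/1 labels.
# The real data has 100 columns; 5 are used here to keep the sketch fast.
set.seed(1)
n <- 200
x <- matrix(rnorm(n * 5), nrow = n, dimnames = list(NULL, paste0("V", 1:5)))
y <- rbinom(n, size = 1, prob = plogis(x[, 1] - 0.5 * x[, 2]))
dat <- data.frame(Y = y, x)

# Randomly assign each row to one of K folds (K = 5 is an arbitrary choice).
K <- 5
fold <- sample(rep(seq_len(K), length.out = n))

# Hold-out predicted probabilities, filled in one fold at a time.
cv_pred <- rep(NA_real_, n)
for (k in seq_len(K)) {
  train <- dat[fold != k, ]   # rows used to fit the model
  test  <- dat[fold == k, ]   # rows held out for prediction
  fit <- glm(Y ~ ., data = train, family = binomial)
  # predict() takes the new rows through the newdata argument.
  cv_pred[fold == k] <- predict(fit, newdata = test, type = "response")
}

# Cross-validated misclassification rate at an (assumed) 0.5 cutoff.
cv_error <- mean((cv_pred > 0.5) != dat$Y)
cv_error
```

Every row ends up with a prediction from a model that never saw it, so cv_pred can be compared against dat$Y with whatever cost function suits the problem.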