Yes, you need to include the intercept term when you predict the model-based response.

This is what you need:

    # Pass lambda by name: the second positional argument of lm.ridge() is `data`
    ridge.test <- lm.ridge(tey_values ~ tedata, lambda = lambda)
    yest <- drop(cbind(1, tedata) %*% coef(ridge.test))

Hope this helps,
Ravi.

-----------------------------------------------------------------------------------
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvarad...@jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html
-----------------------------------------------------------------------------------

From: Eleni Christodoulou [mailto:elenic...@gmail.com]
Sent: Friday, January 08, 2010 11:18 AM
To: Ravi Varadhan
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Ridge regression

I am sorry, I just pressed the "send" button by accident before completing my e-mail. The yest are the estimated values according to the ridge model. Is the way that I calculate them correct? Or should I cut the +coef(ridge.test)[1] term?

Thanks a lot!
Eleni

On Fri, Jan 8, 2010 at 6:16 PM, Eleni Christodoulou <elenic...@gmail.com> wrote:

Hello again and Happy 2010!

I was looking back at this email because I need to do some additional processing now. I was thinking that if I take the coef(ans) I get n+1 coefficients. I guess that the coef(ans)[1] is the constant term... Do I need to add it when I calculate the estimated value for the outcome?
For example, let's say that I have divided my data into training data and test data, and I have the corresponding observed try_values and tey_values (the real values for the samples that belong to the training set and the test set, respectively). Here is my code:

    library(MASS)
    ridge.test=lm.ridge(tey_values~tedata,lambda)
    est<-list()
    yest<-numeric()
    for(i in 1:length(tey_values)){
      est[[i]]=coef(ridge.test)[-1]*tedata[i,]
      yest[i]=sum(est[[i]])+coef(ridge.test)[1]
    }

On Wed, Dec 2, 2009 at 8:22 PM, Ravi Varadhan <rvarad...@jhmi.edu> wrote:

The help page clearly states that ans$coef is "not on the original scale and are for use by the coef method". You also see that ans$scales gives you the scales used in the computation of ans$coef. So, to get coefficients on the original scale, you can either use coef(ans) or you can divide ans$coef by ans$scales.

    library(MASS)

    X1 <- runif(20)
    X2 <- runif(20)
    Y <- 2 * X1 - 2 * X2 + rnorm(20, sd=0.1)

    lam <- 10
    ans1 <- lm.ridge(Y ~ X1 + X2, lambda = lam)

    all.equal(ans1$coef / ans1$scales, coef(ans1)[2:3])

Hope this helps,
Ravi.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ravi Varadhan
Sent: Wednesday, December 02, 2009 12:25 PM
To: 'David Winsemius'; 'Eleni Christodoulou'
Cc: r-help@r-project.org
Subject: Re: [R] Ridge regression

You are right that ans$coef and coef(ans) are different in ridge regression, where `ans' is the object returned by lm.ridge. It is coef(ans) that yields the coefficients on the original scale; ans$coef holds the coefficients of the "X-scaled" and "Y-centered" version.

Here is an example that illustrates the workings of ridge regression. First let us create some data:

    require(MASS)

    X1 <- runif(20)
    X2 <- runif(20)
    Y <- 2 * X1 - 2 * X2 + rnorm(20, sd=0.1)

    lam <- 10
    ans1 <- lm.ridge(Y ~ X1 + X2, lambda = lam)

    ans1$coef
    coef(ans1)
    # Note that these two are different

    # Now let us scale the variables X1 and X2 and center Y
    cY <- scale(Y, scale=FALSE)
    n <- length(Y)
    sX1 <- scale(X1) * sqrt(n/(n-1))
    sX2 <- scale(X2) * sqrt(n/(n-1))

    ans2 <- lm.ridge(cY ~ sX1 + sX2, lambda = lam)

    ans2$coef
    coef(ans2)
    # Now, see that the coefficients of sX1 and sX2 are the same
    # This is the connection!

    # Armed with this insight, we now compare ans1$coef with the scaled coefficients
    ans1$coef
    c(coef(ans1)[2] * sd(X1), coef(ans1)[3] * sd(X2)) * sqrt((n-1)/n)
    # Now they are the same!

I hope this is clear.

Best,
Ravi.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
Sent: Wednesday, December 02, 2009 11:04 AM
To: Eleni Christodoulou
Cc: r-help@r-project.org
Subject: Re: [R] Ridge regression

On Dec 2, 2009, at 10:42 AM, Eleni Christodoulou wrote:

> Dear list,
>
> I have a couple of questions concerning ridge regression. I am using the
> lm.ridge(...) function in order to fit a model to my microarray data.
> Thus *model=lm.ridge(...)*
> I retrieve some coefficients and some scales for each gene. First of all, I
> would like to ask: the real coefficients of the model are not included in
> the first argument of the output but in the result of coef(model), am I
> right?

Not exactly. coef(model) extracts the coefficients from the model, but in the example I created by following the help page, the coefficients do happen to be in the first element of the model, e.g.:

    > long.rr$coef
             GNP   Unemployed Armed.Forces   Population         Year     Employed
      25.3615288    3.3009416    0.7520553  -11.6992718   -6.5403380    0.7864825
    > long.rr[[1]]
             GNP   Unemployed Armed.Forces   Population         Year     Employed
      25.3615288    3.3009416    0.7520553  -11.6992718   -6.5403380    0.7864825

> Moreover, what does the scale argument represent? Which is its
> connection with the coefficients? The R help file is not very informative
> for me...

A plausible response to such a question might be that the help page is a sketchy substitute for the MASS book. However, I cannot find ridge regression in the table of contents or in the index of my copy, but I only have ed. 2 and the current edition is the 4th. So we will both need to wait for persons more knowledgeable (or with more recent editions of MASS) to answer that question. (And "scales" is not an argument; rather, it is a returned value.)
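To see that "scales" is a returned component and how it relates to the data, here is a minimal sketch on simulated data. The variable names, the seed, and lambda = 10 are illustrative assumptions, and the helper rms() is hypothetical, not part of MASS. The sketch assumes that lm.ridge() centers each predictor and scales it by its root mean square with divisor n, which is consistent with the sqrt((n-1)/n) factor used elsewhere in this thread.

```r
library(MASS)

# Simulated data; names, seed and lambda are illustrative only
set.seed(1)
n  <- 20
X1 <- runif(n)
X2 <- runif(n)
Y  <- 2 * X1 - 2 * X2 + rnorm(n, sd = 0.1)

fit <- lm.ridge(Y ~ X1 + X2, lambda = 10)

# "scales" is one of the components of the returned object, not an argument
names(fit)

# Hypothetical helper: root mean square of a centered vector (divisor n, not n - 1)
rms <- function(x) sqrt(mean((x - mean(x))^2))
manual_scales <- c(rms(X1), rms(X2))

# Compare the stored scales with the hand-computed ones
all.equal(unname(fit$scales), manual_scales)
```

If the comparison holds, dividing fit$coef by fit$scales undoes that scaling and gives the slopes on the original scale, which is what coef(fit) reports after the intercept.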
>
> Thank you very much in advance,
> Eleni Christodoulou
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
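Returning to the prediction question at the top of the thread, here is a minimal sketch of computing fitted values from coef() with and without the intercept. All data are simulated, and the seed, dimensions, and lambda = 1 are assumptions made for illustration. Note one deliberate difference from the code in the thread: this sketch fits the model on the training set (trdata, try_values) and predicts for the test set (tedata), rather than fitting on the test set itself.

```r
library(MASS)

# Simulated data; sizes, true coefficients and lambda are illustrative only
set.seed(42)
n <- 60
p <- 5
x <- matrix(rnorm(n * p), n, p)
y <- drop(1.5 + x %*% c(2, -1, 0.5, 0, 0) + rnorm(n, sd = 0.3))

# Split into training and test sets (assumed split: first 40 rows train)
train      <- 1:40
trdata     <- x[train, ]
try_values <- y[train]
tedata     <- x[-train, ]
tey_values <- y[-train]

ridge.fit <- lm.ridge(try_values ~ trdata, lambda = 1)
b <- coef(ridge.fit)   # b[1] is the intercept, b[-1] are slopes on the original scale

# Matrix form: a column of ones picks up the intercept
yest <- drop(cbind(1, tedata) %*% b)

# Loop form, as in the question
yest_loop <- numeric(nrow(tedata))
for (i in seq_len(nrow(tedata))) {
  yest_loop[i] <- sum(b[-1] * tedata[i, ]) + b[1]
}
all.equal(yest, yest_loop)

# Leaving out the intercept shifts every prediction by the constant b[1]
yest_no_int <- drop(tedata %*% b[-1])
all.equal(yest - yest_no_int, rep(unname(b[1]), nrow(tedata)))
```

The two forms are meant to give the same predictions; the matrix form simply avoids the explicit loop.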