Re: [R] Ridge regression

Eleni Christodoulou Fri, 08 Jan 2010 08:20:26 -0800

I am sorry, I pressed the "send" button by accident before finishing my
e-mail. The yest values are the estimates according to the ridge model.
Is the way I calculate them correct? Or should I drop the
+coef(ridge.test)[1] term?
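
For reference, a minimal sketch of the calculation in question. It assumes the
predictors are numeric matrices and that the model is fitted on the training
set and then applied to the test set; trdata, ridge.fit and the simulated
values are hypothetical stand-ins, not the poster's data. Since coef() on an
lm.ridge fit returns the intercept first, the intercept term is kept, and the
loop can be replaced by one matrix product:

```r
library(MASS)

set.seed(1)
# Hypothetical stand-ins for the training and test sets
trdata <- matrix(rnorm(40 * 3), ncol = 3)
try_values <- drop(1 + trdata %*% c(2, -1, 0.5)) + rnorm(40, sd = 0.1)
tedata <- matrix(rnorm(10 * 3), ncol = 3)

ridge.fit <- lm.ridge(try_values ~ trdata, lambda = 10)
b <- coef(ridge.fit)   # intercept first, then one slope per column

# Vectorized: a leading column of ones picks up the intercept
yest <- drop(cbind(1, tedata) %*% b)

# Same result as an explicit per-sample loop that adds the intercept
yest.loop <- numeric(nrow(tedata))
for (i in seq_len(nrow(tedata))) {
  yest.loop[i] <- sum(b[-1] * tedata[i, ]) + b[1]
}
stopifnot(isTRUE(all.equal(yest, yest.loop)))
```

Leaving out b[1] would shift every estimate by the same constant.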


Thanks a lot!
Eleni

On Fri, Jan 8, 2010 at 6:16 PM, Eleni Christodoulou <[email protected]> wrote:

> Hello again and Happy 2010!
> I was looking back at this email because I need to do some additional
> processing now. I was thinking that if I take the coef(ans) I get n+1
> coefficients. I guess that the coef(ans)[1] is the constant term... Do I
> need to add it when I calculate the estimated value for the outcome?
> For example, let's say that I have divided my data into training data and
> test data and I have the corresponding observed try_values and tey_values
> (the real values for the samples that belong to the training set and the
> test set respectively).
> Here is my code:
>
> library(MASS)
> # lambda must be passed by name; the second positional argument is `data`
> ridge.test <- lm.ridge(tey_values ~ tedata, lambda = lambda)
> est <- list()
> yest <- numeric()
> for (i in seq_along(tey_values)) {
>     # slope terms for sample i (coef(ridge.test)[1] is the intercept)
>     est[[i]] <- coef(ridge.test)[-1] * tedata[i, ]
>     yest[i] <- sum(est[[i]]) + coef(ridge.test)[1]
> }
>
>
>
> On Wed, Dec 2, 2009 at 8:22 PM, Ravi Varadhan <[email protected]> wrote:
>
>> The help page clearly states that ans$coef is "not on the original scale
>> and are for use by the coef method".  You also see that ans$scales gives
>> you the scales used in the computation of ans$coef.
>>
>> So, to get coefficients on the original scale, you can either use
>> coef(ans) or you can divide ans$coef by ans$scales (the latter gives the
>> slopes only, without the intercept).
>>
>> library(MASS)
>>
>> X1 <- runif(20)
>> X2 <- runif(20)
>> Y <- 2 * X1 - 2 * X2 + rnorm(20, sd=0.1)
>>
>> lam <- 10
>> ans1 <- lm.ridge(Y ~ X1 + X2, lambda = lam)
>>
>> # Compare slopes only: coef(ans1)[1] is the intercept
>> all.equal(ans1$coef / ans1$scales, coef(ans1)[2:3])
>>
>> Hope this helps,
>> Ravi.
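
A short sketch of how the intercept fits into this picture, using the same
simulated data as above. It assumes the ym and xm components of the lm.ridge
object hold the mean of Y and the column means of X; the names slopes and
intercept are illustrative only:

```r
library(MASS)

set.seed(42)
X1 <- runif(20)
X2 <- runif(20)
Y <- 2 * X1 - 2 * X2 + rnorm(20, sd = 0.1)

ans1 <- lm.ridge(Y ~ X1 + X2, lambda = 10)

# Slopes on the original scale
slopes <- ans1$coef / ans1$scales

# Intercept: mean of Y minus the fitted contribution at the mean of X
intercept <- ans1$ym - sum(slopes * ans1$xm)

# Together these reproduce coef(ans1)
stopifnot(isTRUE(all.equal(unname(c(intercept, slopes)),
                           unname(coef(ans1)))))
```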
>>
>>
>> ----------------------------------------------------------------------------
>>
>> Ravi Varadhan, Ph.D.
>> Assistant Professor, The Center on Aging and Health
>> Division of Geriatric Medicine and Gerontology
>> Johns Hopkins University
>> Ph: (410) 502-2619
>> Fax: (410) 614-9625
>> Email: [email protected]
>> Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html
>>
>> ----------------------------------------------------------------------------
>>
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Ravi Varadhan
>> Sent: Wednesday, December 02, 2009 12:25 PM
>> To: 'David Winsemius'; 'Eleni Christodoulou'
>> Cc: [email protected]
>> Subject: Re: [R] Ridge regression
>>
>> You are right that ans$coef and coef(ans) are different in ridge
>> regression, where `ans' is the object returned by lm.ridge.  It is
>> coef(ans) that yields the coefficients on the original scale.  ans$coef
>> holds the coefficients for the "X-scaled" and "Y-centered" version of the
>> problem.
>>
>> Here is an example that illustrates the workings of ridge regression.
>>
>> First let us create some data:
>>
>> library(MASS)
>>
>> X1 <- runif(20)
>> X2 <- runif(20)
>> Y <- 2 * X1 - 2 * X2 + rnorm(20, sd=0.1)
>>
>> lam <- 10
>> ans1 <- lm.ridge(Y ~ X1 + X2, lambda = lam)
>> ans1$coef
>> coef(ans1)
>> # Note that these two are different
>>
>> # Now let us scale the variables X1 and X2 and center Y.
>> # scale() divides by the sample SD (n-1 denominator); the sqrt(n/(n-1))
>> # factor converts that to the n-denominator SD used by lm.ridge
>> cY <- scale(Y, scale=FALSE)
>> n <- length(Y)
>> sX1 <- scale(X1) * sqrt(n/(n-1))
>> sX2 <- scale(X2) * sqrt(n/(n-1))
>>
>> ans2 <- lm.ridge(cY ~ sX1 + sX2, lambda = lam)
>>
>> ans2$coef
>> coef(ans2)
>> # Now, see that the coefficients of sX1 and sX2 are the same
>> # This is the connection!
>>
>> # Armed with this insight, we now compare ans1$coef with the scaled
>> # coefficients
>> ans1$coef
>> c(coef(ans1)[2] * sd(X1), coef(ans1)[3] * sd(X2)) * sqrt((n-1)/n)
>>
>> # Now they are the same!
>>
>> I hope this is clear.
>>
>> Best,
>> Ravi.
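
The same connection can be sketched without calling lm.ridge for the scaled
fit. This is an illustration under the assumption that, for a model with an
intercept, ans$coef is the usual ridge solution (X'X + lambda I)^{-1} X'y
computed on centered X scaled by its n-denominator SD and on centered Y; the
names Xc, Xs, sds and beta.scaled are made up for the example:

```r
library(MASS)

set.seed(42)
n <- 20
X <- cbind(X1 = runif(n), X2 = runif(n))
Y <- 2 * X[, "X1"] - 2 * X[, "X2"] + rnorm(n, sd = 0.1)
lam <- 10

# Center X, then scale each column by its n-denominator SD
Xc <- sweep(X, 2, colMeans(X))
sds <- sqrt(colSums(Xc^2) / n)
Xs <- sweep(Xc, 2, sds, "/")
cY <- Y - mean(Y)

# Closed-form ridge solution on the scaled problem
beta.scaled <- drop(solve(crossprod(Xs) + lam * diag(ncol(Xs)),
                          crossprod(Xs, cY)))

fit <- lm.ridge(Y ~ X, lambda = lam)

# Matches the stored (scaled) coefficients and the stored scales
stopifnot(isTRUE(all.equal(unname(beta.scaled), unname(fit$coef))))
stopifnot(isTRUE(all.equal(unname(sds), unname(fit$scales))))
```

Dividing beta.scaled by sds then gives the slopes reported by coef(fit).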
>>
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of David Winsemius
>> Sent: Wednesday, December 02, 2009 11:04 AM
>> To: Eleni Christodoulou
>> Cc: [email protected]
>> Subject: Re: [R] Ridge regression
>>
>>
>> On Dec 2, 2009, at 10:42 AM, Eleni Christodoulou wrote:
>>
>> > Dear list,
>> >
>> > I have a couple of questions concerning ridge regression. I am using the
>> > lm.ridge(...) function in order to fit a model to my microarray data.
>> > Thus model=lm.ridge(...)
>> > I retrieve some coefficients and some scales for each gene. First of all,
>> > I would like to ask: the real coefficients of the model are not included
>> > in the first element of the output but in the result of coef(model), am I
>> > right?
>>
>> Not exactly. coef(model) extracts the coefficients from the model, but in
>> the example I created following the help page, coefficients also happen
>> to be in the first element of the model.
>>
>> eg:
>>  > long.rr$coef
>>          GNP   Unemployed Armed.Forces   Population         Year     Employed
>>   25.3615288    3.3009416    0.7520553  -11.6992718   -6.5403380    0.7864825
>>  > long.rr[[1]]
>>          GNP   Unemployed Armed.Forces   Population         Year     Employed
>>   25.3615288    3.3009416    0.7520553  -11.6992718   -6.5403380    0.7864825
>>
>> > Moreover, what does the scale argument represent? What is its
>> > connection with the coefficients? The R help file is not very
>> > informative for me...
>>
>> A plausible response to such a question might be that the help page is
>> a sketchy substitute for the MASS book. However, I cannot find ridge
>> regression in the table of contents or in the index of my copy, but I
>> only have ed. 2 and the current edition is the 4th. So we will both
>> need to wait for someone more knowledgeable (or with a more recent
>> edition of MASS) to answer that question.
>>
>> (And "scales" is not an argument, rather it's a returned value.)
>>
>> >
>> > Thank you very much in advance,
>> > Eleni Christodoulou
>> >
>>
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>> ______________________________________________
>> [email protected] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
