On Feb 13, 2010, at 5:03 PM, Something Something wrote:
I tried..
mod = lm(Y ~ X1*X2*X3, na.action = na.exclude)
formula(mod)
This produced....
Y ~ X1 * X2 * X3
When I typed just mod I got:
Call:
lm(formula = Y ~ X1 * X2 * X3, na.action = na.exclude)
Coefficients:
(Intercept)          X11          X21          X31
   177.9245       0.2005       2.4482       3.1216
    X11:X21      X11:X31      X21:X31  X11:X21:X31
     0.8127     -26.6166      -3.0398      29.6049
I am trying to figure out how R computed all these coefficients.
From your question it is difficult to determine what sort of tutoring
you are expecting. To see the code of an R function, you just type its
name:
lm
Leads to lm.fit:
lm.fit
Reading further it appears the lm and lm.fit functions are really
front ends for this call:
.Fortran("dqrls", qr = x, n = n, p = p, y = y, ny = ny,
tol = as.double(tol), coefficients = mat.or.vec(p, ny),
residuals = y, effects = y, rank = integer(1L), pivot = 1L:p,
qraux = double(p), work = double(2 * p), PACKAGE = "base")
Seems pretty likely that this is a QR decomposition-based method that is
implemented in compiled code.
So if you want to go deeper, at least you know what to search for. Or
if you want to know how regression works on a matrix level, you should
consult a good reference text or Wikipedia, which is surprisingly good
for that sort of question these days.
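As a rough illustration of the QR route mentioned above, here is a sketch in R that uses the six-row students data posted elsewhere in this thread. It solves the same least-squares problem three ways and compares the results. The names dat, X, b_qr, b_ne and b_lm are made up for this example, and this is not the code inside lm.fit, only base R helpers applied to the same design matrix.

```r
# Students data from this thread: y = memory span, x1 = age, x2 = speech rate
dat <- data.frame(
  y  = c(14, 23, 30, 50, 39, 67),
  x1 = c(4, 4, 7, 7, 10, 10),
  x2 = c(1, 2, 2, 4, 3, 6)
)

# Design matrix: a column of ones (intercept) plus x1 and x2
X <- model.matrix(y ~ x1 + x2, data = dat)

# 1. Least squares via a QR decomposition of X
b_qr <- qr.coef(qr(X), dat$y)

# 2. The textbook normal equations (X'X) b = X'y
b_ne <- drop(solve(crossprod(X), crossprod(X, dat$y)))

# 3. What lm() reports
b_lm <- coef(lm(y ~ x1 + x2, data = dat))

print(cbind(qr = b_qr, normal_eq = b_ne, lm = b_lm))
```

All three columns should agree, at roughly 1.67, 1 and 9.5, which is the equation given in the PDF cited in the original question.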
--
David.
On Sat, Feb 13, 2010 at 1:30 PM, Bert Gunter
<gunter.ber...@gene.com> wrote:
?formula
Bert Gunter
Genentech Nonclinical Statistics
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Something Something
Sent: Saturday, February 13, 2010 1:24 PM
To: Daniel Nordlund
Cc: r-help@r-project.org
Subject: Re: [R] lm function in R
Thanks Dan. Yes that was very helpful. I didn't see the change from '*'
to '+'.
Seems like when I put * it means an interaction, and when I put + it's
not an interaction.
Is it correct to assume then that...
When I put + R evaluates the following equation:
Y-Hat = b0 + b1X1 + b2X2 + ... + bkXk
But when I put * R evaluates the following equation:
Y-Hat = b0 + b1X1 + b2X2 + ... + bkXk + b12 X12 + b13 X13 + ... + c
Is this correct? If it is then can someone point me to any sources that
will explain how the coefficients (such as b0... bk, b12.. , b123..) are
calculated. I guess, one source is the R source code :) but is there any
other documentation anywhere?
Please let me know. Thanks.
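One way to check what '+' and '*' actually put into the model is to look at the design matrix R builds from each formula. The sketch below assumes the x1 and x2 values from the students example in this thread; M_add and M_int are names invented for the illustration.

```r
x1 <- c(4, 4, 7, 7, 10, 10)
x2 <- c(1, 2, 2, 4, 3, 6)

# '+' gives main effects only; '*' gives main effects plus their interaction
M_add <- model.matrix(~ x1 + x2)
M_int <- model.matrix(~ x1 * x2)

print(colnames(M_add))  # intercept, x1, x2
print(colnames(M_int))  # intercept, x1, x2, x1:x2

# For numeric predictors the interaction column is just the product x1 * x2
print(all(M_int[, "x1:x2"] == x1 * x2))

# With three variables, '*' expands to every two-way and three-way term
print(attr(terms(Y ~ X1 * X2 * X3), "term.labels"))
```

Each column of the design matrix gets one coefficient, so the '*' model has one extra coefficient per interaction column.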
On Fri, Feb 12, 2010 at 5:54 PM, Daniel Nordlund
<djnordl...@verizon.net> wrote:
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Something Something
Sent: Friday, February 12, 2010 5:28 PM
To: Phil Spector; r-help@r-project.org
Subject: Re: [R] lm function in R
Thanks for the replies everyone. Greatly appreciate it. Some progress,
but now I am getting the following values when I don't use "as.factor":
13.14167 25.11667 28.34167 49.14167 40.39167 66.86667
Is that what you guys get?
If you look at Phil's response below, no, that is not what he got. The
difference is that you are specifying an interaction, whereas Phil did
not (because the equation you initially specified did not include an
interaction). Use Y ~ X1 + X2 instead of Y ~ X1*X2 for your formula.
On Fri, Feb 12, 2010 at 5:00 PM, Phil Spector
<spec...@stat.berkeley.edu> wrote:
By converting the two variables to factors, you are fitting
an entirely different model. Leave out the as.factor stuff
and it will work exactly as you want it to.
dat
V1 V2 V3 V4
1 s1 14 4 1
2 s2 23 4 2
3 s3 30 7 2
4 s4 50 7 4
5 s5 39 10 3
6 s6 67 10 6
names(dat) = c('id','y','x1','x2')
z = lm(y~x1+x2,dat)
predict(z)
       1        2        3        4        5        6
15.16667 24.66667 27.66667 46.66667 40.16667 68.66667
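To see why the as.factor version reproduces the observed values exactly, the following sketch fits both models to the same six rows. The names f1, f2, mod_factor and mod_numeric are invented for this illustration. With x1 and x2 treated as factors and crossed with '*', every student falls in its own cell, so the model has at least as many free parameters as observations and simply returns the data.

```r
y  <- c(14, 23, 30, 50, 39, 67)
x1 <- c(4, 4, 7, 7, 10, 10)
x2 <- c(1, 2, 2, 4, 3, 6)

# Factor version: one dummy column per level, plus all level combinations
f1 <- as.factor(x1)
f2 <- as.factor(x2)
mod_factor <- lm(y ~ f1 * f2)

# Numeric version: a single slope for each predictor
mod_numeric <- lm(y ~ x1 + x2)

# The factor model has far more columns than the 6 observations
print(ncol(model.matrix(mod_factor)))
print(ncol(model.matrix(mod_numeric)))

# Fitted values: the factor model returns y itself, the numeric one does not
print(fitted(mod_factor))
print(fitted(mod_numeric))
```

The many NA coefficients in coef(mod_factor) are another sign that the factor model is over-parameterised for six rows.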
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu
On Fri, 12 Feb 2010, Something Something wrote:
Hello,
I am trying to learn how to perform Multiple Regression Analysis in R. I
decided to take a simple example given in this PDF:
http://www.utdallas.edu/~herve/abdi-prc-pretty.pdf
I created a small CSV called students.csv that contains the following
data:
s1 14 4 1
s2 23 4 2
s3 30 7 2
s4 50 7 4
s5 39 10 3
s6 67 10 6
Col headers: Student id, Memory span(Y), age(X1), speech rate(X2)
Now the expected results are:
yHat[0]:15.166666666666668
yHat[1]:24.666666666666668
yHat[2]:27.666666666666664
yHat[3]:46.666666666666664
yHat[4]:40.166666666666664
yHat[5]:68.66666666666667
This is based on the following equation (given in the PDF):
Y = 1.67 + X1 + 9.50 X2
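As a quick check of the arithmetic, the equation above can be evaluated directly on the six rows. This is only a sketch: y_hat and expected are names made up for the example, and 1.67 is the rounded intercept as printed in the equation, so agreement is to about two decimal places.

```r
x1 <- c(4, 4, 7, 7, 10, 10)   # age
x2 <- c(1, 2, 2, 4, 3, 6)     # speech rate

# Y = 1.67 + X1 + 9.50 X2, applied to each student
y_hat <- 1.67 + 1 * x1 + 9.50 * x2

# Expected values quoted in the question
expected <- c(15.166666666666668, 24.666666666666668, 27.666666666666664,
              46.666666666666664, 40.166666666666664, 68.66666666666667)

print(round(y_hat, 2))
print(max(abs(y_hat - expected)))  # small difference caused by rounding 1.67
```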
I ran the following commands in R:
data = read.table("students.csv", head=F, as.is=T, na.string=".",
row.nam=NULL)
X1 = as.factor(data[[3]])
X2 = as.factor(data[[4]])
Y = data[[2]]
mod = lm(Y ~ X1*X2, na.action = na.exclude)
Y.hat = fitted(mod)
Y.hat
This gives me the following output:
Y.hat
1 2 3 4 5 6
14 23 30 50 39 67
Obviously I am doing something wrong. Please help. Thanks.
Hope this is helpful,
Dan
Daniel Nordlund
Bothell, WA USA
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT