Your results are the same (after scaling and sign reversal) out to the
4th decimal place as those from lda (which by the way is almost
certainly from the MASS package and not from an impossible to find
"lda package".)
> read.table(textConnection(txt))
V1
1 164.4283
2 166.2492
3 170.5232
4 156.5622
5 127.7540
6 136.7704
7 136.3436
> est <-read.table(textConnection(txt))
> scale(est)
V1
[1,] 0.7656185
[2,] 0.8712707
[3,] 1.1192567
[4,] 0.3092117
[5,] -1.3622976
[6,] -0.8391481
[7,] -0.8639119
attr(,"scaled:center")
V1
151.233
attr(,"scaled:scale")
V1
17.23484
> LD1est <- read.table(textConnection(" LD1
+ 1 -2.3769280
+ 2 -2.7049437
+ 3 -3.4748309
+ 4 -0.9599825
+ 5 4.2293774
+ 6 2.6052193
+ 7 2.6820884"), header=T)
> scale(LD1est)
LD1
1 -0.7656170
2 -0.8712721
3 -1.1192555
4 -0.3092138
5 1.3622976
6 0.8391505
7 0.8639103
attr(,"scaled:center")
LD1
-3.172066e-17
attr(,"scaled:scale")
LD1
3.104591
On Sep 28, 2009, at 5:54 PM, Pete Shepard wrote:
I am having a problem understanding the lda package. I have a
dataset here:
[,1] [,2] [,3]
[1,] 2.95 6.63 0
[2,] 2.53 7.79 0
[3,] 3.57 5.65 0
[4,] 3.16 5.47 0
[5,] 2.58 4.46 1
[6,] 2.16 6.22 1
[7,] 3.27 3.52 1
If I do the following;
"names(d)<-c("y","x1","x2")
d$x1 = d$x1 * 100
d$x2 = d$x2 * 100
g<-lda( y ~ x1 + x2, data=d)
v2 <- predict(g, d)",
I get;
LD1
1 -2.3769280
2 -2.7049437
3 -3.4748309
4 -0.9599825
5 4.2293774
6 2.6052193
7 2.6820884
However, If I do it manually,
"rawdata<-matrix(scan("tab1_1.
dat"),ncol=3,byrow=T)
group <- rawdata[,1]
X <- 100 * rawdata[,2:3]
Apf <- X[group==1,]
Af <- X[group==0,]
xbar1 <- apply(Af, 2, mean)
S1 <- var(Af)
N1 <- dim(Af)[1]
xbar2 <- apply(Apf, 2, mean)
S2 <- var(Apf)
N2 <- dim(Apf)[1]
S<-((N1-1)*S1+(N2-1)*S2)/(N1+N2-2)
Sinv=solve(S)
d<-xbar1-xbar2
b <- Sinv %*% d
v <- X %*% b",
I get;
[,1]
[1,] 164.4283
[2,] 166.2492
[3,] 170.5232
[4,] 156.5622
[5,] 127.7540
[6,] 136.7704
[7,] 136.3436
I am having a problem understanding the lda package. I have a
dataset here:
[,1] [,2] [,3]
[1,] 2.95 6.63 0
[2,] 2.53 7.79 0
[3,] 3.57 5.65 0
[4,] 3.16 5.47 0
[5,] 2.58 4.46 1
[6,] 2.16 6.22 1
[7,] 3.27 3.52 1
If I do the following;
"names(d)<-c("y","x1","x2")
d$x1 = d$x1 * 100
d$x2 = d$x2 * 100
g<-lda( y ~ x1 + x2, data=d)
v2 <- predict(g, d)",
I get;
LD1
1 -2.3769280
2 -2.7049437
3 -3.4748309
4 -0.9599825
5 4.2293774
6 2.6052193
7 2.6820884
However, If I do it manually,
"rawdata<-matrix(scan("tab1_1.dat"),ncol=3,byrow=T)
group <- rawdata[,1]
X <- 100 * rawdata[,2:3]
Apf <- X[group==1,]
Af <- X[group==0,]
xbar1 <- apply(Af, 2, mean)
S1 <- var(Af)
N1 <- dim(Af)[1]
xbar2 <- apply(Apf, 2, mean)
S2 <- var(Apf)
N2 <- dim(Apf)[1]
S<-((N1-1)*S1+(N2-1)*S2)/(N1+N2-2)
Sinv=solve(S)
d<-xbar1-xbar2
b <- Sinv %*% d
v <- X %*% b",
I get;
[,1]
[1,] 164.4283
[2,] 166.2492
[3,] 170.5232
[4,] 156.5622
[5,] 127.7540
[6,] 136.7704
[7,] 136.3436
It seems there is an extra step that I am missing? The predict step
that
adds a constant to the second set of values? Can anyone clear this
up for
me?
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.