I am having a problem understanding the lda() function (from the MASS package). I have a dataset here:

     [,1] [,2] [,3]
[1,] 2.95 6.63    0
[2,] 2.53 7.79    0
[3,] 3.57 5.65    0
[4,] 3.16 5.47    0
[5,] 2.58 4.46    1
[6,] 2.16 6.22    1
[7,] 3.27 3.52    1

If I do the following:

library(MASS)  # provides lda()
names(d) <- c("y", "x1", "x2")
d$x1 <- d$x1 * 100
d$x2 <- d$x2 * 100
g <- lda(y ~ x1 + x2, data = d)
v2 <- predict(g, d)  # the LD1 scores are in v2$x

I get:
        LD1
1 -2.3769280
2 -2.7049437
3 -3.4748309
4 -0.9599825
5  4.2293774
6  2.6052193
7  2.6820884

However, if I do it manually:

rawdata <- matrix(scan("tab1_1.dat"), ncol = 3, byrow = TRUE)
group <- rawdata[, 1]
X <- 100 * rawdata[, 2:3]
Apf <- X[group == 1, ]
Af <- X[group == 0, ]
xbar1 <- apply(Af, 2, mean)
S1 <- var(Af)
N1 <- dim(Af)[1]
xbar2 <- apply(Apf, 2, mean)
S2 <- var(Apf)
N2 <- dim(Apf)[1]
S <- ((N1 - 1) * S1 + (N2 - 1) * S2) / (N1 + N2 - 2)  # pooled covariance
Sinv <- solve(S)
d <- xbar1 - xbar2  # difference of the group means
b <- Sinv %*% d
v <- X %*% b

I get:

         [,1]
[1,] 164.4283
[2,] 166.2492
[3,] 170.5232
[4,] 156.5622
[5,] 127.7540
[6,] 136.7704
[7,] 136.3436
It seems there is an extra step that I am missing. Does the predict step add
a constant to the second set of values? Can anyone clear this up for me?
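
One way to look at the gap between the two sets of scores is sketched below, using only the numbers quoted in this post. It assumes (as I understand the MASS documentation, not something stated in the post) that predict() on an lda fit centers the data and that the discriminant is normalized to unit within-group variance, with an arbitrary sign. The names vc, s2 and rescaled are made up for this illustration.

```r
# Scores copied from the post: manual v and predict()'s LD1
v   <- c(164.4283, 166.2492, 170.5232, 156.5622,
         127.7540, 136.7704, 136.3436)
ld1 <- c(-2.3769280, -2.7049437, -3.4748309, -0.9599825,
          4.2293774,  2.6052193,  2.6820884)
group <- c(0, 0, 0, 0, 1, 1, 1)

# Step 1: center the manual scores at their overall mean
vc <- v - mean(v)

# Step 2: pooled within-group variance of the manual scores (divisor N - 2)
ss <- tapply(v, group, function(z) sum((z - mean(z))^2))
s2 <- sum(ss) / (length(v) - 2)

# Step 3: rescale to unit within-group variance; the sign is flipped
# here only to match the direction of the LD1 values in the post
rescaled <- -vc / sqrt(s2)

# Side-by-side comparison of the two columns
print(round(cbind(ld1, rescaled), 4))
```

If the assumption holds, the two columns should agree up to rounding, which would mean the difference is a shift plus a rescaling (and a sign change), not only an added constant.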
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
