Re: [R] prcomp - principal components in R

Tony Plate Mon, 09 Nov 2009 11:27:42 -0800

The output of summary() on a prcomp object displays the cumulative proportion of 
variance relative to the total variance of the principal components PRESENT in the 
object.  So the last principal component present is always guaranteed to show 100%.  
You can see this in the code of summary.prcomp() (view it with 
getAnywhere("summary.prcomp")).
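
To make that concrete, here is a rough sketch of the same arithmetic done by hand. 
It is not the actual source of summary.prcomp(), and the matrix x is made-up data 
(an assumption for illustration, not the x used in the transcript).  Indexing by 
ncol(pc$rotation) is a precaution so the sketch works whether or not pc$sdev itself 
has been truncated to the kept components.

```r
# Hypothetical data for illustration; not the x from the transcript.
set.seed(42)
x <- matrix(rnorm(150), ncol = 5)

pc <- prcomp(x, tol = 0.8)

# Number of components actually kept, and their standard deviations.
# Indexing by k keeps this correct whether or not pc$sdev is truncated.
k <- ncol(pc$rotation)
sdev_kept <- pc$sdev[seq_len(k)]

# Proportions computed relative to the kept components only.
prop_kept <- sdev_kept^2 / sum(sdev_kept^2)
cum_kept  <- cumsum(prop_kept)

# The final entry is 1 by construction, however many components were dropped.
print(cum_kept)
```

Because the denominator is the sum over the kept components, the last value cannot 
be anything other than 1.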


Here's how to get the output you want (the last line in the transcript below):

set.seed(1)
summary(pc1 <- prcomp(x))

Importance of components:
                        PC1   PC2   PC3   PC4   PC5
Standard deviation     1.175 1.058 0.976 0.916 0.850
Proportion of Variance 0.275 0.223 0.190 0.167 0.144
Cumulative Proportion  0.275 0.498 0.688 0.856 1.000

summary(pc2 <- prcomp(x, tol=0.8))

Importance of components:
                       PC1   PC2   PC3
Standard deviation     1.17 1.058 0.976
Proportion of Variance 0.40 0.324 0.276
Cumulative Proportion  0.40 0.724 1.000

pc2$sdev

[1] 1.1749061 1.0581362 0.9759016

pc1$sdev

[1] 1.1749061 1.0581362 0.9759016 0.9164905 0.8503122

svd(scale(x, center=TRUE, scale=FALSE))$d / sqrt(nrow(x)-1)

[1] 1.1749061 1.0581362 0.9759016 0.9164905 0.8503122

cumsum(pc1$sdev^2) / sum((svd(scale(x, center=TRUE, scale=FALSE))$d / 
sqrt(nrow(x)-1))^2)

[1] 0.2752317 0.4984734 0.6883643 0.8558386 1.0000000


# output in terms of the cumulative % of the total variance
cumsum(pc2$sdev^2) / sum((svd(scale(x, center=TRUE, scale=FALSE))$d / 
sqrt(nrow(x)-1))^2)

[1] 0.2752317 0.4984734 0.6883643
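
The denominator in that last line can also be had without a second SVD, since the 
total variance of the centered data equals the sum of the column variances.  A 
minimal sketch, assuming the default scale. = FALSE as in the transcript; 
cum_total_var is a hypothetical helper name and x is made-up data.

```r
# Hypothetical helper: cumulative share of the TOTAL variance explained
# by the components kept in a prcomp fit (assumes scale. = FALSE).
cum_total_var <- function(pc, x) {
  k <- ncol(pc$rotation)            # components actually kept
  total <- sum(apply(x, 2, var))    # total variance = trace of cov(x)
  cumsum(pc$sdev[seq_len(k)]^2) / total
}

set.seed(42)
x <- matrix(rnorm(150), ncol = 5)   # made-up data, not the transcript's x
pc_all <- prcomp(x)
pc_cut <- prcomp(x, tol = 0.8)

cum_all <- cum_total_var(pc_all, x) # reaches 1 at the last component
cum_cut <- cum_total_var(pc_cut, x) # same leading values, stops short of 1
print(cum_all)
print(cum_cut)
```

If the fit used scale. = TRUE, the total would instead be the number of columns, 
since each standardized column has variance 1.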


It's probably better to have prcomp compute all the components in the first 
place, because the SVD is the bulk of the computation anyway (so doing it again 
will be slower for large matrices).  Then just look at the most important 
principal components.  However, there may be a shortcut for computing the 
values of D in the SVD of a matrix; you could look for one if your 
computations are demanding.  For example, the square roots of the eigenvalues 
of the covariance matrix of the centered x give the same values: 
sqrt(eigen(var(scale(x, center=TRUE, scale=FALSE)), only.values=TRUE)$values).
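
Both suggestions can be sketched as follows, again on made-up data (x here is an 
assumption for illustration).  The fit is done once with all components, and the 
eigenvalue route is checked against prcomp's own standard deviations.

```r
set.seed(42)
x <- matrix(rnorm(150), ncol = 5)   # made-up data, not the transcript's x

# Fit once with all components, then inspect only the leading ones.
pc <- prcomp(x)
cum <- cumsum(pc$sdev^2) / sum(pc$sdev^2)
print(cum[1:3])                     # share of total variance in the first 3 PCs

# Eigenvalue route to the same standard deviations, without an SVD of x.
# var() centers the columns itself, so no explicit scale() call is needed.
sdev_eig <- sqrt(eigen(var(x), symmetric = TRUE, only.values = TRUE)$values)
print(all.equal(sdev_eig, pc$sdev))
```

Whether the eigenvalue route is actually faster depends on the shape of x, so 
it is worth timing on your own data before relying on it.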

-- Tony Plate


zubin wrote:

Hello, I don't understand the output of prcomp. I reduce the number of 
components and the output continues to show cumulative 100% of the variance 
explained, which can't be the case when dropping from 8 components to 3. How 
do I get the output in terms of the cumulative % of the total variance, so 
that when I go from the total solution of 8 (8 variables in the data set) to a 
reduced number of components, I can evaluate the % of variance explained? Or 
am I missing something?

 > princ = prcomp(df[,-1],rotate="varimax",scale=TRUE)
 > summary(princ)
Importance of components:
                         PC1   PC2   PC3   PC4   PC5   PC6    PC7    PC8
Standard deviation     1.381 1.247 1.211 0.994 0.927 0.764 0.6708 0.4366
Proportion of Variance 0.238 0.194 0.183 0.124 0.107 0.073 0.0562 0.0238
Cumulative Proportion  0.238 0.433 0.616 0.740 0.847 0.920 0.9762 *1.0000*

 > princ = prcomp(df[,-1],rotate="varimax",scale=TRUE,tol=.75)
 > summary(princ)

Importance of components:
                         PC1   PC2   PC3
Standard deviation     1.381 1.247 1.211
Proportion of Variance 0.387 0.316 0.297
Cumulative Proportion  0.387 0.703 *1.000*

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

