On Jun 18, 2011, at 16:26, Bert Gunter wrote:

> Apologies for the obvious, but just to clarify: there is no reason to
> "justify" a PCA -- it's just an eigen decomposition of a matrix and is
> therefore "justified" by linear algebra.
> 
> If one wants to determine whether some subset of the eigenvectors =
> principal components suffice to "represent" the data in some sense,
> then that is where distributional considerations would come into play.
> But that is another (often unsatisfactory) story, typically irrelevant
> in the exploratory context where PCA is often used.

Yes, I was wondering about that too. PCA on independent variables just sorts
them by variance. PCA on their correlation matrix is essentially a random
orthogonal rotation, since all eigenvalues are then close to 1 and the
directions are set by sampling noise. So PCA is nonsensical if there is no
correlation, but it can be pretty useless even if there is.
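
A minimal sketch of the point about independent variables, in plain Python with no external libraries. Everything in it is an assumption made for illustration: the helper names (covariance, correlation, jacobi_eigen), the three standard deviations, the sample size and the seed. The eigen decomposition uses a textbook cyclic Jacobi rotation, not whatever a statistics package does internally.

```python
import math
import random


def covariance(columns):
    """Sample covariance matrix of a list of equally long columns."""
    n = len(columns[0])
    p = len(columns)
    means = [sum(col) / n for col in columns]
    return [[sum((columns[i][k] - means[i]) * (columns[j][k] - means[j])
                 for k in range(n)) / (n - 1)
             for j in range(p)] for i in range(p)]


def correlation(cov):
    """Rescale a covariance matrix to a correlation matrix."""
    p = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])
             for j in range(p)] for i in range(p)]


def jacobi_eigen(matrix, sweeps=50, tol=1e-18):
    """Eigen decomposition of a symmetric matrix by cyclic Jacobi rotations.

    Returns (values, vectors) sorted by decreasing eigenvalue;
    vectors[k] is the unit eigenvector belonging to values[k].
    """
    p = len(matrix)
    a = [row[:] for row in matrix]
    v = [[float(i == j) for j in range(p)] for i in range(p)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                # Rotation angle that zeroes the (i, j) element.
                theta = 0.5 * math.atan2(2.0 * a[i][j], a[i][i] - a[j][j])
                c, s = math.cos(theta), math.sin(theta)
                for m in (a, v):  # columns i, j of A and V: M <- M G
                    for k in range(p):
                        mi, mj = m[k][i], m[k][j]
                        m[k][i], m[k][j] = c * mi + s * mj, c * mj - s * mi
                for k in range(p):  # rows i, j of A: A <- G^T A
                    ai, aj = a[i][k], a[j][k]
                    a[i][k], a[j][k] = c * ai + s * aj, c * aj - s * ai
    order = sorted(range(p), key=lambda k: -a[k][k])
    values = [a[k][k] for k in order]
    vectors = [[v[r][k] for r in range(p)] for k in order]
    return values, vectors


# Three independent normal variables with assumed standard deviations 1, 3, 2.
random.seed(1)
sds = [1.0, 3.0, 2.0]
n = 5000
data = [[random.gauss(0.0, sd) for _ in range(n)] for sd in sds]

cov = covariance(data)
cov_values, cov_vectors = jacobi_eigen(cov)
# Index of the variable with the largest absolute loading on each component.
dominant = [max(range(len(sds)), key=lambda k: abs(vec[k]))
            for vec in cov_vectors]

cor_values, cor_vectors = jacobi_eigen(correlation(cov))

print("covariance PCA: eigenvalues", cov_values, "dominant variable", dominant)
print("correlation PCA: eigenvalues", cor_values)
print("correlation PCA: first direction", cor_vectors[0])
```

On the covariance matrix each component should load almost entirely on one original variable, in order of decreasing variance. On the correlation matrix the eigenvalues should all sit near 1, and the printed direction is whatever the sampling noise happened to produce; changing the seed changes it.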

Apparently the KMO/Bartlett "justification" comes out of SPSS usage, where a
subculture has emerged in which it is conventional to cite those two
quantities. If you google for "KMO", you'll find oodles of papers using the
statistic, but precious few pages actually discussing or even defining it.
Shame; the "sampling adequacy" notion underlying the KMO measure could do with
a qualified discussion.
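
For reference, the two quantities as they are usually defined in textbooks, given as a sketch and not as a statement of what any particular package computes. The symbols are assumptions introduced here: R is the sample correlation matrix of m variables from n observations, r_ij are its elements, and q_ij is the partial correlation between variables i and j given all the others.

```latex
% Kaiser-Meyer-Olkin measure of sampling adequacy
\mathrm{KMO} =
  \frac{\sum_{i \neq j} r_{ij}^{2}}
       {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} q_{ij}^{2}},
\qquad
q_{ij} = -\frac{a_{ij}}{\sqrt{a_{ii}\, a_{jj}}},
\qquad
(a_{ij}) = R^{-1}

% Bartlett's test of sphericity (null hypothesis: R is the identity)
\chi^{2} = -\left( n - 1 - \frac{2m + 5}{6} \right) \ln \det R,
\qquad
\mathrm{df} = \frac{m(m-1)}{2}
```

KMO is close to 1 when the partial correlations are small relative to the raw correlations, and Bartlett's statistic is large when R is far from the identity.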

(Within such subcultures there often arises an ideology that software is 
somehow flawed if it does not provide their favorite quantities, relevant or 
not. What is really at work is classical group dynamics, as in "you can't go to the
opera if you don't own a tuxedo". See also "bandwagon effect".)  

-pd

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
