You should also have a look at the pcaMethods package (on Bioconductor). I did some code optimization for the NIPALS algorithm in that package which sped up the algorithm by at least a factor of two.
Kevin Wright On Wed, Apr 30, 2008 at 10:56 PM, steven wilson <[EMAIL PROTECTED]> wrote: > Dear list members; > > The code given below corresponds to the PCA-NIPALS (principal > component analysis) algorithm adapted from the nipals function in the > package chemometrics. The reason for using NIPALS instead of SVD is > the ability of this algorithm to handle missing values, but that's a > different story. I've been trying to find a way to improve (if > possible) the efficiency of the code, especially when the number of > components to calculate is higher than 100. I've been working with a > data set of 500 rows x 700 variables. The code gets really slow when > the number of PC to calculate is for example 600 (why that number of > components?....because I need them to detect outliers using another > algorithm). In my old laptop running Win XP and R 2.7.0 (1GB RAM) it > takes around 6 or 7 minutes. That shouldn't be a problem for one > analysis, but when cross-validation is added the time increases > severely.....Although there are several examples on the R help list > giving some with 'hints' to improve effciency the truth is that I > don't know (or I can't see it) the part of the code that can be > improved (but it is clear that the bottle neck is the for and while > loops). I would really appreciate any ideas/comments/etc... > > Thanks > > ################################################################# > > pca.nipals <- function(X, ncomp, iter = 50, toler = sqrt(.Machine$double.eps)) > # X...data matrix, ncomp...number of components, > # iter...maximal number of iterations per component, > # toler...precision tolerance for calculation of components > { > > Xh <- scale(X, center = TRUE, scale = FALSE) > nr <- 0 > T <- NULL; P <- NULL # matrix of scores and loadings > for (h in 1:ncomp) > { > th <- Xh[, 1] > ende <- FALSE > while (!ende) > { > nr <- nr + 1 > ph <- t(crossprod(th, Xh) * as.vector(1 / > crossprod(th))) > ph <- ph * as.vector(1/sqrt(crossprod(ph))) > thnew <- t(tcrossprod(t(ph), Xh) * > as.vector(1/(crossprod(ph)))) > prec <- crossprod(th-thnew) > th <- thnew > if (prec <= (tol^2)) ende <- TRUE > if (it <= nr) ende <- TRUE # didn't converge > } > > Xh <- Xh - tcrossprod(th) > T <- cbind(T, th); P <- cbind(P, ph) > nr <- 0 > } > list(T = T, P = P) > } > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.