Prof Brian Ripley wrote:
> On Sun, 20 Apr 2008, Gad Abraham wrote:
>
>> Hi,
>>
>> Say x.train is a matrix of covariates that I want to do PCA on, so I can
>> do regression on its principal components, and x.test is a test set of
>> the same covariates on which I want to evaluate the regression fit. I
>> would like the covariates to be centred and scaled:
>>
>> p <- prcomp(x.train, center=TRUE, scale=TRUE)
>> x.train.pc <- predict(p)
>>
>> Now I want to get the PCs from the test set.
>
> The way to do that is to call prcomp() on the test set.
>
> If you want to project new data onto the PCs of the training set (as a
> set of axes in the data space), you just use predict(p, newdata=).
>
>> Should I use the same center and scale vectors from the training set:
>>
>> x.test.pc <- predict(p, newdata=x.test, center=p$center, scale=p$scale)
>>
>> or use the test set's own centers and scales:
>>
>> x.test.pc <- predict(p, newdata=x.test, center=TRUE, scale=TRUE)
>
> I see no evidence that those additional arguments are used.
>
> predict.prcomp uses the origin of the training set's PCs, since it is
> that coordinate system which you are projecting onto.
>
I should have looked more carefully. I now see that, in the code for
predict.prcomp, the test data do indeed get centred and scaled according
to the training data's vectors:

getAnywhere(predict.prcomp)
...
scale(newdata, object$center, object$scale) %*% object$rotation

Thanks,
Gad

--
Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
web: http://www.csse.unimelb.edu.au/~gabraham
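
As a minimal sketch of the point about predict.prcomp, with simulated matrices standing in for x.train and x.test (they are made up for illustration and are not data from this thread), the output of predict() can be compared against centring and scaling the test set by hand with the training set's vectors:

```r
set.seed(1)  # simulated data, purely for illustration

# Hypothetical stand-ins for the training and test covariate matrices
x.train <- matrix(rnorm(100 * 4), ncol = 4)
x.test  <- matrix(rnorm(20 * 4, mean = 1), ncol = 4)

# PCA on the centred and scaled training set
p <- prcomp(x.train, center = TRUE, scale. = TRUE)

# Project the test set onto the training set's PCs
x.test.pc <- predict(p, newdata = x.test)

# The same projection by hand: centre and scale the test set with the
# training set's vectors, then rotate onto the training PCs
manual <- scale(x.test, center = p$center, scale = p$scale) %*% p$rotation

# TRUE if the two computations agree up to numerical tolerance
ok <- isTRUE(all.equal(x.test.pc, manual, check.attributes = FALSE))
print(ok)
```

If the two agree, that suggests no extra center or scale arguments are needed in the call to predict(); the training set's values stored in p$center and p$scale are applied to newdata.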