G'Day R users!

Following an ordination using prcomp, I'd like to test which variables significantly contribute to a principal component. Peres-Neto et al. (2003, Ecology 84:2347-2363) suggest a method called the "bootstrapped eigenvector". Jérôme Lemaître asked about it in this forum in January 2005:

"1) Resample 1000 times with replacement entire rows from the original data sets []
2) Conduct a PCA on each bootstrapped sample
3) To prevent axis reflection and/or axis reordering in the bootstrap, here are two more steps for each bootstrapped sample
3a) Calculate the correlation matrix between the PCA scores of the original and those of the bootstrapped sample
3b) Examine whether the highest absolute correlation is between the corresponding axes for the original and bootstrapped samples. When it is not the case, reorder the eigenvectors. This means that if the highest correlation is between the first original axis and the second bootstrapped axis, the loadings for the second bootstrapped axis are used to estimate the confidence interval for the original first PC axis.
4) Determine the p value for each loading, obtained as follows: number of loadings >= 0 for loadings that were positive in the original matrix divided by the number of bootstrap samples (1000), and/or number of loadings <= 0 for loadings that were negative in the original matrix divided by the number of bootstrap samples (1000)."

(see https://stat.ethz.ch/pipermail/r-help/2005-January/065139.html ).

The suggested solution (by Jari Oksanen) was


function (x, permutations=1000, ...)
{
   pcnull <- princomp(x, ...)
   res <- pcnull$loadings
   out <- matrix(0, nrow=nrow(res), ncol=ncol(res))
   N <- nrow(x)
   for (i in 1:permutations) {
       pc <- princomp(x[sample(N, replace=TRUE), ], ...)
       pred <- predict(pc, newdata = x)
       r <-  cor(pcnull$scores, pred)
       k <- apply(abs(r), 2, which.max)
       reve <- sign(diag(r[k,]))
       sol <- pc$loadings[ ,k]
       sol <- sweep(sol, 2, reve, "*")
       out <- out + ifelse(res > 0, sol <=  0, sol >= 0)
   }
   out/permutations
}

However, in a post from March 2005 ( http://r-help.com/msg/6125.html ) Jari himself mentioned that there is a bug in this method.

I was wondering whether someone could tell me where the bug is, or whether there is a better method in R to test the significance of loadings (not the significance of the PCs). Maybe it is not a good idea to do it at all, but I would prefer to have some guideline for interpretation rather than making decisions arbitrarily. I tried to look everywhere before posting here.
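
For discussion, here is a minimal sketch of how I read steps 1 to 4 above, written for prcomp rather than princomp. The function name boot_loadings and the argument nboot are my own and purely illustrative. Like the posted function, it projects the original data onto each bootstrapped PCA with predict() before correlating the scores, and it counts the bootstrap loadings whose sign is zero or opposite to the original one. Unlike the posted function, it picks one bootstrapped axis per original axis (the maximum along each row of the correlation matrix). I do not know whether that difference is the bug Jari referred to, so please treat this as an assumption and not as a fix.

```r
# Sketch only: boot_loadings and nboot are hypothetical names, not from any package.
boot_loadings <- function(x, nboot = 1000, ...) {
  x <- as.matrix(x)
  pc0 <- prcomp(x, ...)          # PCA on the original data
  load0 <- pc0$rotation          # original loadings (variables x axes)
  n <- nrow(x)
  p <- ncol(load0)
  count <- matrix(0, nrow = nrow(load0), ncol = p, dimnames = dimnames(load0))
  for (b in seq_len(nboot)) {
    # Steps 1 and 2: resample entire rows with replacement, then run a PCA
    pcb <- prcomp(x[sample(n, replace = TRUE), , drop = FALSE], ...)
    # Step 3a: correlate original scores (rows of r) with the scores of the
    # original data projected onto the bootstrapped axes (columns of r)
    r <- cor(pc0$x, predict(pcb, newdata = x))
    # Step 3b: for each ORIGINAL axis, take the bootstrapped axis with the
    # highest absolute correlation (assumption: ties and duplicate matches
    # are not handled specially)
    k <- apply(abs(r), 1, which.max)
    # Sign of each matched correlation, used to undo axis reflection
    s <- sign(r[cbind(seq_len(p), k)])
    lb <- sweep(pcb$rotation[, k, drop = FALSE], 2, s, "*")
    # Step 4: count bootstrap loadings that are zero or of opposite sign
    count <- count + ifelse(load0 > 0, lb <= 0, lb >= 0)
  }
  count / nboot
}

# Usage on a built-in data set, with fewer resamples to keep it quick
set.seed(1)
pvals <- boot_loadings(USArrests, nboot = 200, scale. = TRUE)
round(pvals, 3)
```

One limitation I am aware of: the same bootstrapped axis can be matched to more than one original axis, so the reordering is not guaranteed to be a permutation.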

I would be very thankful for any help,

Axel

--
Gravity is a habit that is hard to shake off.
Terry Pratchett

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.